Re: [Xen-devel] [patch]Make xend to take care of dead qemu-dm process
Hi Kan,

Thanks for your comment. Corrected that, and some other mistakes.
Please review this patch again.

Thanks,
Xiaowei

--- tools/python/xen/xend/server/SrvServer.py.org 2008-05-21 13:53:08.000000000 +0800
+++ tools/python/xen/xend/server/SrvServer.py 2008-05-21 15:28:09.000000000 +0800
@@ -44,6 +44,7 @@
 import re
 import time
 import signal
+import os
 from threading import Thread
 
 from xen.web.httpserver import HttpServer, UnixHttpServer
@@ -148,14 +149,26 @@
 
         # Reaching this point means we can auto start domains
         try:
-            xenddomain().autostart_domains()
+            dom = xenddomain()
+            dom.autostart_domains()
         except Exception, e:
             log.exception("Failed while autostarting domains")
 
         # loop to keep main thread alive until it receives a SIGTERM
         self.running = True
         while self.running:
-            time.sleep(100000000)
+            # destroy any HVM domain whose device model (DM) has died unexpectedly
+            for item in dom.domains.values():
+                if item.info.is_hvm():
+                    device_model_pid = item.gatherDom(('image/device-model-pid', str))
+                    dm_stat_cmd = "ps -o stat --no-headers -p"+device_model_pid
+                    dm_stat = os.popen(dm_stat_cmd).readline().rstrip()
+                    if dm_stat == 'Z':
+                        log.warn("Device model for domain " + str(item.domid) + " was killed unexpectedly")
+                        item.destroy()
+                    else:
+                        continue
+            time.sleep(30)
 
             if self.reloadingConfig:
                 log.info("Restarting all XML-RPC and Xen-API servers...")

On Wed, 2008-05-21 at 16:21 +0900, Masaki Kanno wrote:
> Hi Xiaowei,
>
> Nonessential comment.
>
> Your patch includes both tab-indent and space-indent.
> Could you change tab-indent to space-indent?
>
> Best regards,
> Kan
>
> Wed, 21 May 2008 14:33:05 +0800, shawn wrote:
>
> >Hi all,
> >
> >There is a problem in Xen now: when a fatal error happens to a VM, such
> >as the qemu-dm process dying, xend should take care of it rather than
> >leaving a defunct (zombie) process behind.
> >The qemu-dm process can die because of misconfiguration or for some
> >other reason.
> >
> >For now, xend does not take care of this dead process, so it remains as
> >a defunct process, and xm list shows the status of the VM assigned to
> >the process as
> >vserv1134 5 6144 1 ------ 0.0
> >
> >This patch fixes xend so that when a fatal error happens (e.g. the
> >qemu-dm process is killed), it logs an error message, then destroys that
> >domain and cleans up the process (no zombies).
> >
> >This is caused by the xend daemon: xend forks a process and runs the
> >qemu-dm program. When qemu-dm is killed directly, xend doesn't have a
> >chance to call the wait() function to collect this zombie child (qemu-dm
> >is executed by a thread), because xend has no idea whether the qemu-dm
> >child is alive or has been killed.
> >
> >For the above reason, I added some code to xend to check the status of
> >each HVM DM every 30 seconds.
> >
> >I have made a patch based on the open source Xen 3.2.1 source code.
> >
> >Please review this patch.
> >
> >Thanks.
> >Xiaowei
> >
> >--- xen-3.2.1/tools/python/xen/xend/server/SrvServer.py.org 2008-05-21 13:53:08.000000000 +0800
> >+++ xen-3.2.1/tools/python/xen/xend/server/SrvServer.py 2008-05-21 13:58:56.000000000 +0800
> >@@ -44,6 +44,7 @@
> > import re
> > import time
> > import signal
> >+import os
> > from threading import Thread
> >
> > from xen.web.httpserver import HttpServer, UnixHttpServer
> >@@ -148,14 +149,28 @@
> >
> >         # Reaching this point means we can auto start domains
> >         try:
> >-            xenddomain().autostart_domains()
> >+            dom = xenddomain()
> >+            dom.autostart_domains()
> >         except Exception, e:
> >             log.exception("Failed while autostarting domains")
> >
> >         # loop to keep main thread alive until it receives a SIGTERM
> >         self.running = True
> >         while self.running:
> >-            time.sleep(100000000)
> >+            # loop to destroy those hvm domain that whoes DM has dead unexpectedly.
> >+            for item in dom.domains.values():
> >+                if item.info.is_hvm():
> >+                    device_model_pid = item.gatherDom(('image/device-model-pid', str))
> >+                    dm_stat_cmd = "ps -o stat --no-headers -p"+ device_model_pid
> >+                    dm_stat = os.popen(dm_stat_cmd).readline().rstrip()
> >+                    log.info("status of the command is:" + dm_stat + "end of output")
> >+                    if dm_stat == 'Z':
> >+                        log.info("status of the command is:" + dm_stat + "end of output")
> >+                        log.warn("Devices Model for " + str(item) + "was killed unexpectedly")
> >+                        item.destroy()
> >+                    else:
> >+                        continue
> >+            time.sleep(30)
> >
> >             if self.reloadingConfig:
> >                 log.info("Restarting all XML-RPC and Xen-API servers...")

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
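
The zombie check in the patch compares the ps STAT column to exactly 'Z'. As far as I know, procps can append modifier characters to that column (for example "Z+" or "Zs"), so an exact comparison might miss some zombies. Below is a minimal sketch of a more tolerant check. It is not part of the patch and assumes Python 3.5+ rather than the Python 2 that xend uses; the helper names is_zombie_stat and read_stat are hypothetical.

```python
import subprocess


def is_zombie_stat(stat):
    """Return True if a ps STAT value describes a zombie (defunct) process."""
    # Only the first character is the process state; any following
    # characters are modifiers, so "Z", "Z+" and "Zs" all count as zombies.
    return stat.strip().startswith("Z")


def read_stat(pid):
    """Return the STAT column ps reports for pid, or "" if ps prints nothing."""
    # "stat=" requests the STAT column with an empty header, which avoids
    # needing --no-headers. Passing an argument list (no shell) and forcing
    # pid through int() keeps a malformed pid string out of the command line.
    result = subprocess.run(
        ["ps", "-o", "stat=", "-p", str(int(pid))],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    return result.stdout.strip()
```

In the polling loop, the condition would then read is_zombie_stat(read_stat(device_model_pid)). An empty result from read_stat means ps found no such process; whether the domain should also be destroyed in that case is a separate decision that this sketch does not make.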