[Xen-devel] A fix for the xend restart problems (2.0.x)
The basic problem, which judging from the list archives I'm not the only one running into: the first time xend is restarted while any guests are running, it immediately dies on an exception along the lines of "Invalid backend domain" after destroying one of the domUs. Further attempts to restart it fail with "Failed to map domain control interface", unless the dom0 kernel is NetBSD built with the DIAGNOSTIC option, in which case it panics.

After far too much time assuming this was a NetBSD-specific problem, I eventually tracked it down in xend. The patch below probably isn't the Right solution, but it works:

--- tools/python/xen/xend/XendDomain.py.orig	2005-08-13 01:54:56.000000000 -0400
+++ tools/python/xen/xend/XendDomain.py	2005-08-13 01:55:17.000000000 -0400
@@ -147,7 +147,10 @@
             domid = str(d['dom'])
             doms[domid] = d
         dlist = []
-        for config in self.domain_db.values():
+        domkeys = map(int, self.domain_db.keys())
+        domkeys.sort()
+        for domkey in domkeys:
+            config = self.domain_db.get(str(domkey))
             domid = str(sxp.child_value(config, 'id'))
             if domid in doms:
                 d_dom = self._new_domain(config, doms[domid])

This change in traversal order avoids the exception shown below. The exception occurs while a domU's info is being reconstructed: its devices' backend domain (here, dom0) is looked up but doesn't appear to exist yet, because it hasn't yet been restored from the state files (or by querying the hypervisor, for that matter). I assume it's due to code shared with a domain's actual creation that xend then tries to destroy the domain once this fails.

The idea of the patch, then, is to restore the domains' state in the same order as they were created, which sorting by numeric domain ID should achieve as long as IDs are assigned in increasing order.

This is the trace of the exception in question. Normally it gets caught partway up and the "invalid backend domain" exception is raised from there, but I commented out the try/except so I could see the original exception:

Traceback (most recent call last):
  File "/usr/local/sbin/xend", line 121, in ?
    sys.exit(main())
  File "/usr/local/sbin/xend", line 107, in main
    return daemon.start()
  File "/pkg/xentools-2.0.6/usr/lib/python/xen/xend/server/SrvDaemon.py", line 525, in start
  File "/pkg/xentools-2.0.6/usr/lib/python/xen/xend/server/SrvDaemon.py", line 615, in run
  File "/pkg/xentools-2.0.6/usr/lib/python/xen/xend/server/SrvServer.py", line 47, in create
  File "/pkg/xentools-2.0.6/usr/lib/python/xen/xend/server/SrvRoot.py", line 29, in __init__
  File "/pkg/xentools-2.0.6/usr/lib/python/xen/xend/server/SrvDir.py", line 69, in get
  File "/pkg/xentools-2.0.6/usr/lib/python/xen/xend/server/SrvDir.py", line 39, in getobj
  File "/pkg/xentools-2.0.6/usr/lib/python/xen/xend/server/SrvDomainDir.py", line 25, in __init__
  File "/usr/local/lib/python2.4/site-packages/xen/xend/XendDomain.py", line 800, in instance
    inst = XendDomain()
  File "/usr/local/lib/python2.4/site-packages/xen/xend/XendDomain.py", line 65, in __init__
    self.initial_refresh()
  File "/usr/local/lib/python2.4/site-packages/xen/xend/XendDomain.py", line 154, in initial_refresh
    d_dom = self._new_domain(config, doms[domid])
  File "/usr/local/lib/python2.4/site-packages/xen/xend/XendDomain.py", line 189, in _new_domain
    deferred = XendDomainInfo.vm_recreate(savedinfo, info)
  File "/usr/local/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 218, in vm_recreate
    d = vm.construct(config)
  File "/usr/local/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 456, in construct
    deferred = self.configure()
  File "/usr/local/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 975, in configure
    d = self.create_devices()
  File "/usr/local/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 803, in create_devices
    v = dev_handler(self, dev, dev_index)
  File "/usr/local/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 1110, in vm_dev_vif
    defer = ctrl.attachDevice(vif, val, recreate=recreate)
  File "/usr/local/lib/python2.4/site-packages/xen/xend/server/netif.py", line 423, in attachDevice
    dev = self.addDevice(vif, config)
  File "/usr/local/lib/python2.4/site-packages/xen/xend/server/netif.py", line 400, in addDevice
    dev = NetDev(vif, self, config)
  File "/usr/local/lib/python2.4/site-packages/xen/xend/server/netif.py", line 105, in __init__
    self.configure(config)
  File "/usr/local/lib/python2.4/site-packages/xen/xend/server/netif.py", line 150, in configure
    self.backendDomain = int(xd.domain_lookup(sxp.child_value(config, 'backend', '0')).id)
  File "/usr/local/lib/python2.4/site-packages/xen/xend/XendDomain.py", line 430, in domain_lookup
    raise XendError('invalid domain:' + name)
xen.xend.XendError.XendError: invalid domain:0

-- 
(let ((C call-with-current-continuation)) (apply (lambda (x y) (x y)) (map
((lambda (r) ((C C) (lambda (s) (r (lambda l (apply (s s) l))))))  (lambda
(f) (lambda (l) (if (null? l) C (lambda (k) (display (car l)) ((f (cdr l))
(C k)))))))    '((#\J #\d #\D #\v #\s) (#\e #\space #\a #\i #\newline)))))

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
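
A footnote on the ordering idea in the patch from the message above: the sketch below reproduces the failure mode in isolation, assuming a toy stand-in for xend's saved-domain table. The names `restore_all` and `domain_db`, and the dict-of-dicts layout, are hypothetical and are not xend's real API. The code is written for a current Python 3 rather than the Python 2.4 the patch targets. It also shows why the patch converts the keys to integers before sorting: as strings, "10" would sort ahead of "2".

```python
def restore_all(domain_db, ordered):
    """Restore saved domains and return the keys in the order visited.

    domain_db maps string domain ids to config dicts (hypothetical layout).
    A config may name a 'backend' domain id that must already be restored.
    """
    restored = set()
    # Numeric sort, as in the patch; a plain string sort would misorder "10" and "2".
    keys = sorted(domain_db, key=int) if ordered else list(domain_db)
    for key in keys:
        config = domain_db[key]
        backend = config.get("backend")
        if backend is not None and backend not in restored:
            # Stand-in for the XendError raised by domain_lookup in the traceback.
            raise LookupError("invalid domain:%d" % backend)
        restored.add(config["id"])
    return keys


# Toy table in which a domU happens to be visited before dom0.
domain_db = {
    "10": {"id": 10, "backend": 0},
    "0": {"id": 0},
    "2": {"id": 2, "backend": 0},
}

print(restore_all(domain_db, ordered=True))

try:
    restore_all(domain_db, ordered=False)
except LookupError as exc:
    print("unordered restore failed:", exc)
```

With ordered traversal dom0 is always restored first, so every backend lookup succeeds. This relies on the backend having a lower id than its frontends, which holds for a dom0 backend but is only an assumption for driver domains in general.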