Re: [Xen-devel] [PATCH RFC] libxc: Protect xc_domain_resume from clobbering domain registers
On 17/05/14 17:01, Jason Andryuk wrote:
> xc_domain_resume() expects the guest to be in state SHUTDOWN_suspend.
> However, nothing verifies the state before modify_returncode() modifies
> the domain's registers.  This will crash guest processes or the kernel
> itself.
>
> This can be demonstrated with `LIBXL_SAVE_HELPER=/bin/false xl migrate`.
>
> Signed-off-by: Jason Andryuk <andryuk@xxxxxxxx>

Hmm.  There is no possible way whatsoever that migration can work if a PV
guest is not in SHUTDOWN_suspend.  PV guests have to leave an MFN in edx
which the toolstack rewrites with a new MFN on resume.

By default, there is no need for knowledge from the HVM guest for
migrate.  XenServer is perfectly capable of migrating HVM VMs without PV
drivers.  I suspect therefore that we never use cooperative resume.

This cooperative resume which modifies guest register state therefore
imposes the same SHUTDOWN_suspend restriction on HVM guests as it does
for PV guests.

As a result, your patch below is correct as a fallback safety measure,
and should be taken.

However, the caller of modify_returncode is also at fault for attempting
to resume an already-running domain.  I think there needs to be a bugfix
there as well.  I presume that some piece of code is assuming that,
despite libxl-save-helper failing, xc_domain_save() paused the guest,
which is clearly not true in this case.

~Andrew

> ---
>
> This change stops xc_domain_resume from killing my domUs on a failed
> migration.  I'm using a wrapper around libxl-save-helper which may fail
> before libxl-save-helper is invoked, so xc_domain_save has not been
> called.  The idle Linux domU kernels would BUG coming out of
> SCHEDOP_block in xen_safe_halt() since modify_returncode set EAX to 1.
> journald was also observed to segfault.
>
> As written, this code treats calling xc_domain_resume on a running
> domain as an error.  Do we want it silently ignored?  Output with this
> patch looks like:
>
> """
> Migration failed, resuming at sender.
> xc: error: Domain not in suspended state: Internal error
> libxl: error: libxl.c:402:libxl__domain_resume: xc_domain_resume failed for domain 92: Interrupted system call
> """
>
> libxl__domain_resume prints errno, but it is stale for this case.
> xc_domain_resume_cooperative could swallow modify_returncode's error,
> bypass issuing XEN_DOMCTL_resumedomain, and return success to avoid the
> libxl error message.
>
> ---
>  tools/libxc/xc_resume.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/tools/libxc/xc_resume.c b/tools/libxc/xc_resume.c
> index 18b4818..9ec6a59 100644
> --- a/tools/libxc/xc_resume.c
> +++ b/tools/libxc/xc_resume.c
> @@ -39,6 +39,12 @@ static int modify_returncode(xc_interface *xch, uint32_t domid)
>          return -1;
>      }
>
> +    if ( !info.shutdown || (info.shutdown_reason != SHUTDOWN_suspend) )
> +    {
> +        ERROR("Domain not in suspended state");
> +        return 1;
> +    }
> +
>      if ( info.hvm )
>      {
>          /* HVM guests without PV drivers have no return code to modify. */

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
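
For illustration only, the guard discussed in this thread can be modelled without libxc. This is a minimal sketch, not the libxc implementation: every name prefixed with sketch_ or SKETCH_ is hypothetical, the struct only mirrors the two domain-info fields the patch reads, and the value 2 is a local stand-in for SHUTDOWN_suspend. Returning -1 on refusal is a choice made for the sketch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Local stand-in for SHUTDOWN_suspend (assumed to be 2, as in Xen's
 * public headers); not taken from this thread. */
enum { SKETCH_SHUTDOWN_SUSPEND = 2 };

/* Hypothetical subset of the domain info that the guard inspects. */
struct sketch_dominfo {
    bool shutdown;                /* domain has shut down */
    unsigned int shutdown_reason; /* why it shut down */
};

/* True only when the domain is shut down with reason "suspend". */
static bool sketch_domain_is_suspended(const struct sketch_dominfo *info)
{
    return info->shutdown &&
           info->shutdown_reason == SKETCH_SHUTDOWN_SUSPEND;
}

/*
 * Model of a cooperative resume: refuse to touch guest registers unless
 * the domain is suspended.  Returns 0 on success, -1 on refusal, and
 * records in *regs_touched whether the (simulated) register rewrite ran.
 */
static int sketch_resume_cooperative(const struct sketch_dominfo *info,
                                     bool *regs_touched)
{
    *regs_touched = false;

    if (!sketch_domain_is_suspended(info)) {
        fprintf(stderr, "Domain not in suspended state\n");
        return -1;
    }

    /* A real implementation would rewrite the suspend return code here. */
    *regs_touched = true;
    return 0;
}
```

A caller could use sketch_domain_is_suspended() before attempting a resume at all, which corresponds to the caller-side fix suggested above, while the check inside sketch_resume_cooperative() remains as the fallback safety measure.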