Re: [Xen-devel] 3.4.70+ kernel WARNING spew dysfunction on failed migration
create ^
title it libxl should implement non-suspend-cancel based resume path
owner Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>
thanks

To summarise what I just said to Ian J in the corridor (and let's have a bug to record it):

There are two mechanisms by which a suspend can be aborted and the original domain resumed.

The older method is that the toolstack resets a bunch of state (see tools/python/xen/xend/XendDomainInfo.py resumeDomain) and then restarts the domain. The domain will see HYPERVISOR_suspend return 0 and will continue without any realisation that it is actually running in the original domain and not in a new one. This method is supposed to be implemented by libxl_domain_resume(suspend_cancel=0) but it is not.

The other method is newer, and in this case the toolstack arranges that HYPERVISOR_suspend returns 1 and restarts it (I believe ...). The domain will observe this, realise that it has been restarted in the same domain, and behave accordingly. This method is implemented, correctly AFAIK, by libxl_domain_resume(suspend_cancel=1).

However, the newer method is not available in all kernels. It does date from the Linux 2.6.18 days and is implemented in all Linux pvops kernels, but I can't speak for others (e.g. BSD). The toolstack is supposed to check for the XEN_ELFNOTE_SUSPEND_CANCEL ELF note when building the domain. The presence/absence of this flag needs to be remembered so that it can be consulted on resume (this also implies preserving that knowledge over migration).

xl currently uses libxl_domain_resume(suspend_cancel=0) on migration failure, which as it stands won't work for *any* domain. Arguably, switching to suspend_cancel=1 for now will mean that some subset of kernels will work, and those which don't will not have regressed, until we can correctly implement suspend_cancel=0 and the necessary tracking of XEN_ELFNOTE_SUSPEND_CANCEL.

I've also just noticed that on failure to save (as opposed to migrate) xl does use suspend_cancel=1.

Ian.
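For the record, the selection logic described above could be sketched roughly as follows. This is only an illustration in Python, not libxl code: DomainRecord, record_from_elf_notes and choose_suspend_cancel are hypothetical names, the ELF notes are modelled as a plain set of note names (presence/absence only), and legacy_path_implemented stands in for "suspend_cancel=0 actually works in the toolstack".

```python
from dataclasses import dataclass


@dataclass
class DomainRecord:
    """Hypothetical per-domain state kept by the toolstack.

    It would have to be carried along on migration so that the
    receiving side can still consult it on resume.
    """
    name: str
    has_suspend_cancel_note: bool


def record_from_elf_notes(name, elf_note_names):
    """Remember at domain build time whether the kernel image
    carried the XEN_ELFNOTE_SUSPEND_CANCEL note.

    elf_note_names is an assumed, simplified view of the kernel's
    ELF notes: just a set of note names.
    """
    return DomainRecord(
        name=name,
        has_suspend_cancel_note="XEN_ELFNOTE_SUSPEND_CANCEL" in elf_note_names,
    )


def choose_suspend_cancel(record, legacy_path_implemented):
    """Pick the suspend_cancel value to resume with after an aborted suspend."""
    if record.has_suspend_cancel_note:
        # Newer method: the guest sees HYPERVISOR_suspend return 1.
        return 1
    if legacy_path_implemented:
        # Older method: toolstack resets state, guest sees return 0.
        return 0
    # Interim fallback: the older path does not work for any domain,
    # so trying the newer one cannot make things worse.
    return 1
```

With legacy_path_implemented=False this reduces to "always use suspend_cancel=1", i.e. the interim behaviour proposed for xl on migration failure.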
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel