Re: [PATCH v3 01/11] xen/manage: keep track of the on-going suspend mode
On Wed, May 26, 2021 at 02:29:53PM -0400, Boris Ostrovsky wrote:
>
> On 5/26/21 12:40 AM, Anchal Agarwal wrote:
> > On Tue, May 25, 2021 at 06:23:35PM -0400, Boris Ostrovsky wrote:
> >>
> >> On 5/21/21 1:26 AM, Anchal Agarwal wrote:
> >>>>> What I meant there wrt VCPU info was that VCPU info is not
> >>>>> unregistered during hibernation, so Xen still remembers the old
> >>>>> physical addresses for the VCPU information, created by the booting
> >>>>> kernel. But since the hibernation kernel may have different physical
> >>>>> addresses for VCPU info, a mismatch may cause issues with resume.
> >>>>> During hibernation, the VCPU info register hypercall is not invoked
> >>>>> again.
> >>>> I still don't think that's the cause but it's certainly worth having
> >>>> a look.
> >>>>
> >>> Hi Boris,
> >>> Apologies for picking this up after last year.
> >>> I did a deep dive on the above statement and that is indeed what is
> >>> happening. I did some debugging around KASLR and hibernation using
> >>> reboot mode. I observed in my debug prints that whenever the
> >>> vcpu_info* address assigned to a secondary vcpu in xen_vcpu_setup at
> >>> boot is different from what is in the image, resume gets stuck for
> >>> that vcpu in bringup_cpu(). That means we have different addresses
> >>> for &per_cpu(xen_vcpu_info, cpu) at boot and after control jumps into
> >>> the image.
> >>>
> >>> I failed to get any prints after it got stuck in bringup_cpu(), and I
> >>> do not have an option to send a sysrq signal to the guest or to get a
> >>> kdump.
> >>
> >> xenctx and xen-hvmctx might be helpful.
> >>
> >>> This change is not observed in every hibernate-resume cycle. I am not
> >>> sure if this is a bug or expected behavior.
> >>> Also, I am contemplating the idea that it may be a bug in Xen code
> >>> that gets triggered only when KASLR is enabled, but I do not have
> >>> substantial data to prove that.
> >>> Is it a coincidence that this always happens for the 1st vcpu?
> >>> Moreover, since the hypervisor is not aware that the guest hibernated
> >>> and it looks like a regular shutdown to dom0 during reboot mode, is
> >>> re-registering vcpu_info for the secondary vcpus even plausible?
> >>
> >> I think I am missing how this is supposed to work (maybe we've talked
> >> about this but it's been many months since then). You hibernate the
> >> guest and it writes the state to swap. The guest is then shut down?
> >> And what's next? How do you wake it up?
> >>
> >>
> >> -boris
> >>
> > To resume a guest, the guest boots up as a fresh guest, and then
> > software_resume() is called; if it finds a stored hibernation image, it
> > quiesces the devices and loads the memory contents from the image.
> > Control then transfers to the targeted kernel. This further disables
> > the non-boot cpus, and the syscore_suspend/resume callbacks are
> > invoked, which set up the shared_info, pvclock, grant tables, etc.
> > Since the vcpu_info pointer for each non-boot cpu is already
> > registered, the hypercall does not happen again when bringing up the
> > non-boot cpus. This leads to inconsistencies, as pointed out earlier,
> > when KASLR is enabled.
>
>
> I'd think the 'if' condition in the code fragment below should always
> fail since the hypervisor is creating a new guest, resulting in the
> hypercall, just like in the case of save/restore.
>
That only fails during boot, not after the control jumps into the image.
The non-boot cpus are brought offline (freeze_secondary_cpus) and then
back online via the cpu hotplug path. In that case xen_vcpu_setup() does
not invoke the hypercall again.
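(For reference, the check in question: this is an abridged paraphrase of
xen_vcpu_setup(), not the exact mainline code; the
!xen_have_vcpu_info_placement fallback and most of the error handling are
elided.)

    /*
     * Abridged paraphrase of xen_vcpu_setup(), for illustration only.
     * The !xen_have_vcpu_info_placement fallback and most error handling
     * are elided.
     */
    int xen_vcpu_setup(int cpu)
    {
            struct vcpu_register_vcpu_info info;
            struct vcpu_info *vcpup = &per_cpu(xen_vcpu_info, cpu);
            int err;

            if (xen_hvm_domain()) {
                    /*
                     * Already registered once for this kernel: skip the
                     * hypercall.  After the jump into the hibernation
                     * image this test passes (the image kernel registered
                     * its own xen_vcpu_info when it originally booted),
                     * so Xen keeps the fresh boot kernel's addresses.
                     */
                    if (per_cpu(xen_vcpu, cpu) == &per_cpu(xen_vcpu_info, cpu))
                            return 0;
            }

            info.mfn = arbitrary_virt_to_mfn(vcpup);
            info.offset = offset_in_page(vcpup);

            /* Tell Xen where this vcpu's vcpu_info lives. */
            err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info,
                                     xen_vcpu_nr(cpu), &info);
            if (!err)
                    per_cpu(xen_vcpu, cpu) = vcpup;

            return err;
    }

With KASLR the boot kernel and the image kernel generally place
per_cpu(xen_vcpu_info, cpu) at different addresses, so skipping the
hypercall leaves Xen delivering to the fresh boot kernel's vcpu_info
rather than the image kernel's.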
> Do you call xen_vcpu_info_reset() on resume? That will re-initialize
> per_cpu(xen_vcpu). Maybe you need to add this to xen_syscore_resume().
>
Yes, coincidentally I did. The registration of vcpu_info then fails with
error -22, basically because nobody unregistered it and Xen does not know
that the guest hibernated in the first place. Moreover, syscore_resume is
also called in the hibernation path, i.e. after the image is created:
everything is resumed and thawed back before the final write of the image
and the machine shutdown. So syscore_resume can only invoke
xen_vcpu_info_reset() when it is actually resuming from the image. I was
luckily able to use the in_suspend variable to detect that (a rough
sketch of this follows at the end of this mail).

Another line of thought is to do something like what kexec does to get
around this problem: abuse soft_reset and issue it during syscore_resume,
or maybe before the image gets loaded. I haven't experimented with that
yet, as I am assuming there has to be a way to re-register vcpus during
resume.

Thanks,
Anchal
>
> -boris
>
>
> > Thanks,
> > Anchal
> >>
> >>> I could definitely use some advice to debug this further.
> >>>
> >>>
> >>> Some printk's from my debugging:
> >>>
> >>> At boot:
> >>>
> >>> xen_vcpu_setup: xen_have_vcpu_info_placement=1 cpu=1,
> >>> vcpup=0xffff9e548fa560e0, info.mfn=3996246 info.offset=224,
> >>>
> >>> Image loads:
> >>> It ends up in this condition:
> >>>
> >>> xen_vcpu_setup()
> >>> {
> >>> ...
> >>>     if (xen_hvm_domain()) {
> >>>         if (per_cpu(xen_vcpu, cpu) == &per_cpu(xen_vcpu_info, cpu))
> >>>             return 0;
> >>>     }
> >>> ...
> >>> }
> >>>
> >>> xen_vcpu_setup: checking mfn on resume cpu=1, info.mfn=3934806
> >>> info.offset=224, &per_cpu(xen_vcpu_info, cpu)=0xffff9d7240a560e0
> >>>
> >>> This is tested on a c4.2xlarge [8 vcpu, 15 GB mem] instance with a
> >>> 5.10 kernel running in the guest.
> >>>
> >>> Thanks,
> >>> Anchal.
> >>>> -boris
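(For completeness, the xen_vcpu_info_reset()-on-resume experiment
mentioned above would look roughly like the sketch below. This is an
illustration only, not the actual patch: it assumes this series'
xen_syscore_resume() callback, and hibernation_resuming() is a made-up
placeholder for however the hibernate core's in_suspend state would be
exposed to Xen code.)

    static void xen_syscore_resume(void)
    {
            int cpu;

            /*
             * syscore_resume() also runs while the image is being created
             * (everything is resumed and thawed before the image is
             * written out), so only reset when control really came back
             * through the loaded image.  hibernation_resuming() stands in
             * for the in_suspend check mentioned above.
             */
            if (!hibernation_resuming())
                    return;

            /*
             * Point per_cpu(xen_vcpu) back at the shared_info slots so
             * that the next xen_vcpu_setup() re-issues
             * VCPUOP_register_vcpu_info with the image kernel's vcpu_info
             * addresses.  As noted above, Xen currently rejects that
             * re-registration with -22 (EINVAL) because the vcpu_info
             * registered by the fresh boot kernel was never unregistered.
             */
            for_each_possible_cpu(cpu)
                    xen_vcpu_info_reset(cpu);
    }

Whether Xen can be made to accept the re-registration, or whether
something like a soft reset is needed first (as kexec does), is exactly
the open question above.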