Re: [Xen-devel] Xen4.2 S3 regression?
On 20/09/2012 21:30, "Ben Guthro" <ben@xxxxxxxxxx> wrote:

> (XEN) Bringing CPU1 down
> (XEN) Disabling CPU1
> (XEN) Disabled CPU1
> (XEN) play_dead: CPU1
> (XEN) cpu_exit_clear: CPU1
> (XEN) cpu_uninit: CPU1
> (XEN) CPU1 dead

So CPU1 is taken down properly, apparently...

> (XEN) Entering ACPI S3 state.

... During S3 suspend.

> (XEN) Finishing wakeup from ACPI S3 state.
> (XEN) Enabling non-boot CPUs ...
> (XEN) Bringing CPU1 up
> (XEN) Setting warm reset code and vector.
> (XEN) Asserting INIT.
> (XEN) Waiting for send to finish...
> (XEN) +Deasserting INIT.
> (XEN) Waiting for send to finish...
> (XEN) +#startup loops: 2.
> (XEN) Sending STARTUP #1.
> (XEN) After apic_write.
> (XEN) CPU#1 already initialized!

But here CPU1 thinks it is already initialised! *This* is the bug you need
to go look at. CPU1 will spin at this point...

> (XEN) Startup point 1.
> (XEN) Waiting for send to finish...
> (XEN) +Sending STARTUP #2.
> (XEN) After apic_write.
> (XEN) Startup point 1.
> (XEN) Waiting for send to finish...
> (XEN) +After Startup.
> (XEN) After Callout 1.
> (XEN) Stuck ??

...Causing CPU0 to think CPU1 is stuck (which is fair, because it is).

> (XEN) cpu_exit_clear: CPU1
> (XEN) cpu_uninit: CPU1
> (XEN) __cpu_up - do_boot_cpu error
> (XEN) cpu_up CPU1 CPU not up
> (XEN) cpu_up CPU1 fail
> (XEN) Error taking CPU1 up: -5
> [  32.780055] ACPI: Low-level resume complete
> [  32.780055] PM: Restoring platform NVS memory
> [  32.780055] Enabling non-boot CPUs ...
>
> then it crashes.
>
> It seems that it is always falling through into the "else" clause of
> the do_boot_cpu() function when attempting to bring it back up, seemingly
> stuck in CPU_STATE_CALLOUT
>
> Any ideas as to what might be causing it to get stuck in that state?

Yes, see explanation above, which is actually the same explanation I gave
you before. You need to go investigate why CPU1 is getting confused in
cpu_init().

 -- Keir

>> I put a cpu id conditional BUG() call in there, to verify - and while it is
>> reached when using
>> xen-hptool cpu-offline 1
>> It never seems to be reached from the S3 path.
>>
>> What is the expected call chain to get into this code during S3?
>>
>> On Thu, Sep 20, 2012 at 4:03 AM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>>>>>> On 20.09.12 at 08:13, Keir Fraser <keir.xen@xxxxxxxxx> wrote:
>>>> CPU#1 got stuck in loop in cpu_init() as it appears to be already
>>>> initialised in cpu_initialized bitmap. CPU#0 detects it is stuck and
>>>> carries on, but the resume code assumes all CPUs are brought back online
>>>> and crashes later.
>>>
>>> So this would suggest play_dead() (-> cpu_exit_clear() ->
>>> cpu_uninit()) not getting reached during the suspend cycle.
>>> That should be fairly easy to verify, as the serial console
>>> ought to still work when the secondary CPUs get offlined.
>>>
>>> That might imply cpumask_clear_cpu(cpu, &cpu_online_map)
>>> not getting reached in __cpu_disable(), which would be in line
>>> with the observation that none of the logs provided so far
>>> showed anything being done by fixup_irqs() (called right
>>> after clearing the online bit).
>>>
>>> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
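
For readers following the thread, the failure mode discussed above (a CPU
finding its bit already set in the cpu_initialized bitmap when cpu_init()
runs again on resume) can be illustrated with a small toy model. This is a
sketch under stated assumptions, not Xen's actual code: the toy_* names, the
single-word bitmap, and the -1 return value (standing in for the point where
the real CPU would spin) are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define TOY_MAX_CPUS 8u

/* Hypothetical stand-in for the cpu_initialized bitmap: one bit per CPU. */
static unsigned long toy_cpu_initialized;

/* Set the CPU's bit and report whether it was already set beforehand. */
static bool toy_test_and_set(unsigned int cpu)
{
    unsigned long mask = 1UL << cpu;
    bool was_set = (toy_cpu_initialized & mask) != 0;

    toy_cpu_initialized |= mask;
    return was_set;
}

/*
 * Toy model of the check described in the thread. Returns 0 when the CPU
 * initialises normally, or -1 at the point where (per the thread) the real
 * CPU would print the warning and spin.
 */
static int toy_cpu_init(unsigned int cpu)
{
    assert(cpu < TOY_MAX_CPUS);
    if (toy_test_and_set(cpu)) {
        printf("CPU#%u already initialized!\n", cpu);
        return -1;
    }
    return 0;
}

/* Toy model of the teardown step: clear the bit so a later init succeeds. */
static void toy_cpu_uninit(unsigned int cpu)
{
    assert(cpu < TOY_MAX_CPUS);
    toy_cpu_initialized &= ~(1UL << cpu);
}
```

In this model, toy_cpu_init(1) followed by toy_cpu_uninit(1) and another
toy_cpu_init(1) succeeds both times, while two toy_cpu_init(1) calls with no
toy_cpu_uninit(1) in between fail on the second call. The log in this thread
shows cpu_uninit running on the way down, so the open question is what leaves
the bit set again by the time CPU1 is brought back up.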