RE: [Xen-devel] RE: Question about Xen S3 and resume code - Linux dom0 never exits the xen_safe_halt hypercall after resume
Hi, Konrad,

any update on this S3 problem you're seeing? I just got a chance to give it a
try on my Dell core-i7 platform with an Ubuntu 10.10 system.

Xen version is:

changeset:   23632:33717472f37e
tag:         tip
user:        Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
date:        Tue Jun 28 18:15:44 2011 +0100
summary:     libxc: Squash xc_e820.h (and delete) into xenctrl.h

for dom0 I use origin/master plus ACPI patches queued on your
origin/devel/acpi-s3.v0:

commit 4aa69dc48e031276b4d771dcb227d553fd3def0b
Merge: df5b2b6 9f90a3b
Author: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Date:   Tue Jun 21 09:34:31 2011 -0400

    Merge branch '3.0-rc1-rem_pg_reserve-4' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm

With or without the ACPI processor patches, ACPI S3 just works well on my box.

Thanks
Kevin

> From: Tian, Kevin
> Sent: Tuesday, June 21, 2011 7:22 AM
>
> > From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@xxxxxxxxxx]
> > Sent: Monday, June 20, 2011 8:36 PM
> >
> > > ideally ACPI S3/S5 has nothing to do with ACPI processor driver which is
> > > for Cx/Px.
> >
> > Right..
> >
> > > >
> > > > (which is in the devel/acpi-s3.v0 branch).
> > > >
> > > > the hypervisor, after an S3 resume sits forever in the default_idle. The
> > > > Linux dom0 is stuck looping (I think) around SCHEDOP_block hypercall.
> > > >
> > > > http://darnok.org/xen/devel.acpi-s3.v1.serial.log
> > > >
> > > > If that patch above is present and I've cpufreq=xen on the Xen
> > > > hypervisor then Linux kernel gets unstuck and returns to userspace:
> > > >
> > > > http://darnok.org/xen/devel.acpi-s3.v0.serial.log
> > >
> > > Compare your logs, the major difference is:
> > >
> > > [ 168.754739] calling i2c-8+ @ 3096
> > > [ 168.758200] call i2c-8+ returned 0 after 0 usecs
> > > <<< 1st case stuck here
> > > [ 168.762882] calling card0-VGA-1+ @ 3096
> > > [ 168.766867] call card0-VGA-1+ returned 0 after 0 usecs
> > > [ 168.772085] calling ttm+ @ 3096
> > > [ 168.775360] call ttm+ returned 0 after 0 usecs
> > > [ 168.779870] PM: resume of devices complete after 13117.603 msecs
> > > [ 168.786006] PM: Finishing wakeup.
> > > <<< 2nd case forward progress
> > >
> > > It looks like the VGA card has some problem on resume, which then
> >
> > In both cases - with the patch and without..
>
> that's expected since device suspend is always invoked in the S3 path.
>
> >
> > > makes dom0 stay in idle loop and thus block hypercall, and then due to
> > > no runnable vcpu Xen spends most time in idle_loop too. In earlier log
> > > there're some stack traces in i915 driver. Perhaps you can try a
> > > different machine
> >
> > Or remove the i915 just to eliminate that.
>
> So any result there? :-)
>
> >
> > > or try native S3 on same box to make sure it's not mixed with native
> > > issues.
> > >
> > > >
> > > > (however, if I set cpuidle=0 cpufreq=none on the hypervisor line and
> > > > have the 9f301b0a0081676dfc71b7f0898295e6bcba391a patch it still
> > > > gets stuck).
> > > >
> > > > I figured that the primary reason the guest is allowed to
> > > > exit its SCHEDOP_block loop is b/c the pm_idle call is set to the
> > > > acpi_processor_idle which does "something" extra after the machine
> > > > comes out of a S3 suspend.
> > >
> > > If that's the case I think you should disable CONFIG_ACPI_PROCESSOR in
> > > dom0 before incorporating Xen specific version (the patch you tried).
> > > We don't want dom0 to play with Cx directly b/c it's the responsibility
> > > of Xen.
> >
> > Huh? You misunderstood me. The 'acpi_processor_idle' is the hypervisor's
> > idle loop. It can be running inside of that one, or the 'default_idle'
> > loop. Hence
>
> running inside which one? I'd think only default_idle invokes it when the
> current cpu is actually idle.
>
> > my question why would that specific hypervisor idle loop make dom0 run
> > nicely while the default one would not.
>
> this is counterintuitive to me honestly speaking. I'd rather think that
> acpi_processor_idle may cause some issue than pure "sti;hlt" because the acpi
> version has more logic to handle. In the earlier days, when it was still in
> the stabilization phase, we did observe some non-exit cases from deep Cstate
> but this never happens on pure hlt.
>
> IOW, I don't take this idle path as a necessary step to make S3 resume
> work; it is simply what runs when the cpu has nothing to do...
>
> >
> > In dom0, regardless of the patches, the 'default_idle' is run which makes
> > the xen_safe_halt paravirt call.
>
> OK, that matches my expectation then.
>
> >
> > >
> > > Of course we still need to figure out why the same issues occur with
> > > cpuidle=0/cpufreq=none, which however can be revisited after the basic
> > > S3 works. :-)
> >
> > Right. The end result of those parameters is that the 'default_idle' in the
> > hypervisor is chosen instead of the 'acpi_processor_idle' one.
> >
> > > >
> > > > Any ideas?
> > >
> > > No other ideas for now. From historical view Xen S3 was supported before
> >
> > Hmm, I am actually tempted to start commenting out code in the
> > acpi_processor_idle and seeing what will cause it to have the same
> > failure as 'default_idle'.
>
> you can also try "max_cstates=1" to see any difference, which is expected to
> have a similar effect to safe_halt().
>
> Thanks
> Kevin
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel