
Re: [Xen-devel] Only CPU0 active after ACPI S3, xen 4.1.3



Sadly, the full revert of this changeset in Xen-4.2 did not solve the resume issue I am seeing on the Lenovo T430.

My current suspicion is irq delivery, because of the following messages I see on the console on the way down:

(XEN) Preparing system for ACPI S3 state.
(XEN) Disabling non-boot CPUs ...
(XEN) Broke affinity for irq 1
(XEN) Broke affinity for irq 9
(XEN) Broke affinity for irq 12
(XEN) Broke affinity for irq 26
(XEN) Broke affinity for irq 30
(XEN) Broke affinity for irq 1
(XEN) Broke affinity for irq 1
(XEN) Entering ACPI S3 state.
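
To see at a glance which IRQs are affected, the messages can be tallied from a captured console log. This is only a minimal sketch: the helper name count_broken_affinity and the regular expression are illustrative assumptions, and only the message format is taken from the excerpt above.

```python
import re
from collections import Counter

# Console excerpt copied verbatim from the log above.
LOG = """\
(XEN) Preparing system for ACPI S3 state.
(XEN) Disabling non-boot CPUs ...
(XEN) Broke affinity for irq 1
(XEN) Broke affinity for irq 9
(XEN) Broke affinity for irq 12
(XEN) Broke affinity for irq 26
(XEN) Broke affinity for irq 30
(XEN) Broke affinity for irq 1
(XEN) Broke affinity for irq 1
(XEN) Entering ACPI S3 state.
"""

# Assumed pattern: matches only the "Broke affinity" lines shown above.
BROKE_RE = re.compile(r"^\(XEN\) Broke affinity for irq (\d+)$")


def count_broken_affinity(log_text):
    """Return a Counter mapping IRQ number -> number of 'Broke affinity' lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = BROKE_RE.match(line.strip())
        if match:
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    for irq, n in sorted(count_broken_affinity(LOG).items()):
        print(f"irq {irq}: affinity broken {n} time(s)")
```

On this excerpt, irq 1 is reported three times and irqs 9, 12, 26 and 30 once each.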

I am not currently pinning dom0 vcpus.

Of course, this is merely a suspicion, and I don't have a lot of hard evidence to back this up.

I've requested an ITP be budgeted to debug these issues on Intel SDPs, but I think it may be some months before I see the results of that.


Jan - any suggestions on how to proceed with this? FWIW, Xen 4.0.y suspends on this machine reliably.


/btg

On Sun, Dec 23, 2012 at 8:45 AM, Ben Guthro <ben.guthro@xxxxxxxxx> wrote:
Interesting.

I had started by reverting the commit entirely, but settled on only reverting the part causing the scheduling issues.
I'm not sure I was as thorough in testing this fix across a lot of laptop generations.

I'll test reverting the full commit in the new year, and report back.

I think that, at a minimum, the commit should get some scrutiny from people who might understand the subtleties and/or unintended side effects better than I do.


-Ben


On Sat, Dec 22, 2012 at 9:49 PM, Marek Marczykowski <marmarek@xxxxxxxxxxxxxxxxxxxxxx> wrote:
On 21.12.2012 17:18, Ben Guthro wrote:
> On Dec 21, 2012, at 11:03 AM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>
>>>>> On 21.12.12 at 16:30, Marek Marczykowski <marmarek@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>> Next bisection (this time with sched_ratelimit_us=0) gives this commit:
>>> http://xenbits.xen.org/hg/xen-4.1-testing.hg/rev/d67e4d12723f
>>
>> Ben, wasn't this where your bisection ended up too?
>
> Yes, for the dom0_vcpus_pin issue.

OK, I can confirm that on the xen-4.1-testing tip with the above commit reverted
completely, the problem has gone away, even without sched_ratelimit_us=0. With
Ben's patch (partial revert), no reboot was observed, but sometimes still only
pCPU0 is used after resume.

--
Best Regards / Pozdrawiam,
Marek Marczykowski
Invisible Things Lab



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

