[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] Hypervisor crash(!) on xl cpupool-numa-split



On 02/10/2011 07:42 AM, Juergen Gross wrote:
On 02/09/11 15:21, Juergen Gross wrote:
Andre, George,


What seems to be interesting: I think the problem did always occur when
a new cpupool was created and the first cpu was moved to it.

I think my previous assumption regarding the master_ticker was not too bad.
I think somehow the master_ticker of the new cpupool is becoming active
before the scheduler is really initialized properly. This could happen, if
enough time is spent between alloc_pdata for the cpu to be moved and the
critical section in schedule_cpu_switch().

The solution should be to activate the timers only if the scheduler is
ready for them.

George, do you think the master_ticker should be stopped in suspend_ticker
as well? I still see potential problems for entering deep C-States. I think
I'll prepare a patch which will keep the master_ticker active for the
C-State case and migrate it for the schedule_cpu_switch() case.

Okay, here is a patch for this. It ran on my 4-core machine without any
problems.
Andre, could you give it a try?
Did, but unfortunately it crashed as always. Tried twice and made sure I booted the right kernel. Sorry. The idea with the race between the timer and the state changing sounded very appealing, actually that was suspicious to me from the beginning.

I will add some code to dump the state of all cpupools to the BUG_ON to see in which situation we are when the bug triggers.

Regards,
Andre.

--
Andre Przywara
AMD-OSRC (Dresden)
Tel: x29712


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.