
[Xen-devel] Re: [patch 0/3]Enable CMCI (Corrected Machine Check Error Interrupt) for Intel CPUs

On 23/12/2008 08:40, "Keir Fraser" <keir.fraser@xxxxxxxxxxxxx> wrote:

>> Moving *cmci_owner_set* out of stopmachine_run is basically OK for us.
>> Just one thing:
>> A CMCI might occur and be lost during the very small window in which the old
>> owner has been cleared but the new owner is not yet set. To make sure that
>> CMCI can still be triggered on the new owner, we need to clear the
>> [Corrected Error Counter] field of the MSR Bank(i) status register. (We
>> normally do this in the CMCI interrupt handler; according to the spec, if
>> the counter is not cleared, CMCI will not be triggered any more.)
>> I made a small patch for it in the attachment. What do you think?
> I don't know very much about CMCI. If you think this is required I will
> certainly check it in.

Actually I think this is a good idea, even if we'd stayed with your original
CMCI patches. I will apply it.
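
For illustration only (this is not the attached patch): a minimal sketch of
what "clear the counter when a new owner takes the bank" could look like. It
simulates the bank status MSRs with a plain array so it can run anywhere.
NR_BANKS, fake_mci_status[] and both function names are hypothetical. The
field position (bits 52:38 of the bank status register) is my reading of the
Intel SDM and should be checked against it. Real code would use rdmsr/wrmsr
and, as far as I know, may only write zero to the status register.

```c
#include <assert.h>
#include <stdint.h>

#define NR_BANKS 4                      /* hypothetical bank count */

/* Assumed layout: Corrected Error Count in bits 52:38 of the status MSR. */
#define MCI_STATUS_CEC_SHIFT 38
#define MCI_STATUS_CEC_MASK  (UINT64_C(0x7fff) << MCI_STATUS_CEC_SHIFT)

/* Stand-in for the per-bank status MSRs; real code would rdmsr/wrmsr. */
static uint64_t fake_mci_status[NR_BANKS];

/* Return the corrected error count currently latched in a bank. */
static unsigned int bank_corrected_count(unsigned int bank)
{
    assert(bank < NR_BANKS);
    return (unsigned int)((fake_mci_status[bank] & MCI_STATUS_CEC_MASK)
                          >> MCI_STATUS_CEC_SHIFT);
}

/*
 * Called by the new owner right after it claims a bank: report any errors
 * counted while the bank had no owner, then clear the status register so
 * that a later corrected error can raise CMCI again.
 */
static unsigned int claim_and_clear_bank(unsigned int bank)
{
    unsigned int pending = bank_corrected_count(bank);

    fake_mci_status[bank] = 0;          /* simulated write of zero */
    return pending;
}
```

The return value lets the caller log errors that arrived during the
ownerless window instead of silently dropping them.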

One thing -- if you want to reduce the window between release of a bank by
its old owner and acquisition by a new owner, we could do the whole lot
before stop_machine_run()? Maybe cmci_cpu_down(cpu) which would IPI 'cpu' to
clear its CMCI state and then IPI all other CPUs to pick up the released
banks. This would be neatly hooked off CPU_DOWN_PREPARE or similar in Linux,
but Xen doesn't have cpu notifiers. :-) You'd have to call cmci_cpu_down()
explicitly in cpu_down(). Or perhaps we should have cpu notifier chains in
Xen too...
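
A rough sketch of the ordering I have in mind, as a standalone simulation
rather than real Xen code. The two "IPI" steps are plain function calls here,
and bank_owner[], cpu_online[], cmci_release() and cmci_discover() are
made-up names for illustration. It also assumes any online CPU can claim any
bank, which glosses over the fact that banks are only shared within some set
of CPUs.

```c
#include <assert.h>

#define NR_CPUS  4                      /* hypothetical sizes */
#define NR_BANKS 6
#define NO_OWNER (-1)

static int bank_owner[NR_BANKS];        /* owning CPU, or NO_OWNER */
static int cpu_online[NR_CPUS];         /* non-zero if the CPU is online */

/* Step 1: would run on 'cpu' via IPI; drop every bank it owns. */
static void cmci_release(int cpu)
{
    for (int b = 0; b < NR_BANKS; b++)
        if (bank_owner[b] == cpu)
            bank_owner[b] = NO_OWNER;
}

/* Step 2: would run on each remaining CPU via IPI; claim free banks. */
static void cmci_discover(int cpu)
{
    for (int b = 0; b < NR_BANKS; b++)
        if (bank_owner[b] == NO_OWNER)
            bank_owner[b] = cpu;
}

/* Called explicitly from cpu_down(), before stop_machine_run(). */
static void cmci_cpu_down(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);

    cmci_release(cpu);
    for (int c = 0; c < NR_CPUS; c++)
        if (c != cpu && cpu_online[c])
            cmci_discover(c);
}
```

Note there is no rollback path: once a remaining CPU has claimed a bank it
simply keeps it.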

If we do the above I don't think we need to re-introduce your rollback
logic. If you think about it, there's no reason to prefer the old owner over
the new owner, so no reason to roll back. I believe?

 -- Keir
