
Re: [Xen-devel] [PATCH 2/5] x86/idle: re-arrange dead-idle handling



>>> On 05.12.18 at 21:33, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 10/09/2018 11:13, Jan Beulich wrote:
>>
>>> Equally, it may still be able to service #MC's, so I can't see how it is
>>> safe for us to ever free the percpu data.
>> I'm having trouble seeing how this remark relates to the series here.
> 
> Because you've tried to make NMIs safe, but not made equivalent
> adjustments to the #MC side of things.

Explain to me how this is getting worse with the patch in question.
It doesn't alter under what conditions per-CPU data gets freed.
Of course I can short-circuit the #MC handler just like I do for the
NMI one, but that's only going to delay shutdown of the core
until a second #MC surfaces (as the first one would never get
dealt with).
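
To illustrate the point about a second #MC, here is a toy user-space
model in C. It is not Xen code: the struct and every toy_* name are
hypothetical. It rests on one architectural assumption, namely that a
machine check arriving while MCG_STATUS.MCIP is still set sends the core
into shutdown. A handler that returns without doing anything leaves MCIP
set, so the first event is swallowed and the second one takes the core
down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in Xen. */
struct toy_cpu {
    bool mcip;                 /* models MCG_STATUS.MCIP */
    bool shut_down;            /* core has entered shutdown state */
    unsigned int handled;      /* number of #MCs actually dealt with */
    void (*mc_handler)(struct toy_cpu *);
};

/* A working handler deals with the event and clears MCIP when done. */
void toy_mc_handler(struct toy_cpu *cpu)
{
    cpu->handled++;
    cpu->mcip = false;
}

/* Short-circuited handler: returns immediately, MCIP stays set. */
void toy_trap_nop(struct toy_cpu *cpu)
{
    (void)cpu;
}

/* Deliver one #MC to the modelled core. */
void toy_raise_mc(struct toy_cpu *cpu)
{
    assert(cpu != NULL && cpu->mc_handler != NULL);

    if (cpu->shut_down)
        return;                /* nothing left to deliver to */

    if (cpu->mcip) {
        /* Previous #MC was never completed: the core shuts down. */
        cpu->shut_down = true;
        return;
    }

    cpu->mcip = true;
    cpu->mc_handler(cpu);
}
```

With toy_mc_handler installed, repeated calls to toy_raise_mc() are each
handled and the core stays up. With toy_trap_nop installed, the first
call leaves mcip set and the second call sets shut_down.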

>> Plus it's a theoretical problem at present only anyway:
>> - physical hot remove is not implemented (there's no source of the
>>   new CPU_REMOVE notification),
>> - Intel CPUs get parked, i.e. never have their per-CPU data freed,
>> - AMD CPUs don't broadcast #MC.
> 
> Ignoring MCE's is never an option, but whenever CR4.MCE is set, we must
> be prepared to handle #MC.  Just because an AMD CPU is playing dead
> doesn't mean it is immune to receiving #MC's.

Again - this is nothing the patch here changes in any way. It's
not clear to me whether clearing CR4.MCE is an option on AMD
CPUs.

Anything beyond wiring #MC into trap_nop() (please let me
know if that's what you want to see added) should imo not
be part of this patch, and imo doesn't even need to be
part of this series.

Jan




 

