[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3 10/10] xen/arm: Enable errata for secondary CPU on hotplug after the boot



On Fri, 2018-05-11 at 12:41 +0200, Mirela Simonovic wrote:
> Hi Dario,
> 
Hi,

> On Thu, May 10, 2018 at 6:24 PM, Dario Faggioli <dfaggioli@xxxxxxxx>
> wrote:
> > I may very well be missing or misunderstanding something, but I
> > continue to think that the problem here is that CPU_STARTING can't,
> > right now, fail, while you need it to be able to.
> > 
> > If that is the case, giving different priorities to the notifier,
> > is
> > not a solution.
> > 
> Let me try to clarify. The assumption is that the starting CPU can
> fail. Additional requirement set by Julien is that panic or BUG_ON is
> not acceptable. 
>
So, currently, in Xen, CPU bringup can't fail at the CPU_STARTING
phase. This is the point of the BUG_ON().

It is you that need it to change this, and make it possible for CPU
bringup to fail during CPU_STARTING. This is fine, but require changes,
which, IMO, are not limited to removing the BUG_ON() or trading it with
something else.

> There are 2 ways to deal with this scenario:
> 
> 1) Ignore and report the error, and let the erroneous CPU become
> online.
> This cannot be done without changing the logic in either scheduler or
> notify_cpu_starting(), or I least I don't see how would that be done.
> In previous email when I said "escalating this to who knows where" I
> did not refer to error escalation but the escalation of the scope of
> this series.
> 
How can that be an option? If CPU bringup failed, how is it possible to
let it go online?

To me, it's not that "we can't let a CPU for which bringup failed
continue without changing the scheduler or notify_cpu_starting()". It's
rather "we must not a CPU for which bringup faile continue. Period.".

Which is to say, you need to change things (in common code!) in such a
way that CPU bringup can fail during the CPU_STARTING phase.

> 2) Stop the erroneous CPU.
> Taking this approach requires setting the priority for the notifier.
>
No! Stop the CPU if the bringup fails duting the CPU_STARTING phase
does not require setting the priorities of the CPU_STARTING notifier. 

It requires to changing things (in common code) in such a way that CPU
bringup can fail during the CPU_STARTING phase. (Did I say that
already? :-) )

I understand that, if you set the priority, your series work. But that
does not mean that it is a proper solution to the problem. It means
that it is an hack that...well... makes your series work. :-)

> The key point is that notify_cpu_starting() and scheduler do not have
> to be changed. If errata notifier has higher priority than
> scheduler's
> notifier in the case of an error the CPU will not return into
> notify_cpu_starting() and it will never execute BUG_ON because it
> will
> be stopped. The rest of the system will continue to function without
> that CPU.
> 
Right. But now we have and architecture (ARM) for which CPU bringup can
fail at the CPU_STARTING phase. And yet, looking at common code, we see
that the CPU_STARTING notifier has a BUG_ON() if it ever fails. How
would people looking at this in 2 years time from now make sense of
this?

No, I don't think there really is a sensible workaround here. If you
need the CPU_STARTING phase of CPU bringup to be able to fail, you need
to make all the changes required to make that happen properly.

Which does involve changing common code, and which should therefore be
discussed with the other arches and core hypervisor maintainers.

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/

Attachment: signature.asc
Description: This is a digitally signed message part

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.