[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3 10/10] xen/arm: Enable errata for secondary CPU on hotplug after the boot





On 11/05/18 13:20, Mirela Simonovic wrote:
Hi,

On Fri, May 11, 2018 at 2:07 PM, Mirela Simonovic
<mirela.simonovic@xxxxxxxxxx> wrote:
On Fri, May 11, 2018 at 12:54 PM, Julien Grall <julien.grall@xxxxxxx> wrote:
On 11/05/18 11:41, Mirela Simonovic wrote:
On Thu, May 10, 2018 at 6:24 PM, Dario Faggioli <dfaggioli@xxxxxxxx>
wrote:
On Thu, 2018-05-10 at 17:49 +0200, Mirela Simonovic wrote:
I am going to repeat the content of my answer to your last e-mail:

I was aware about it since the beginning. The whole point of the
conversation was we should avoid to take the decision at the lower level and
let the upper layer decide what to do.

If the system is failing today then that's fine and still fit what I said in
my first e-mail of that thread. For reminder:

"We should really avoid to use panic(...) if this is something the system
can survive. In that specific case, it would only affect the current CPU. So
it would be better to return an error and let the caller decide what to do."

To summarize:
         1) Notifiers should only report an error and let the upper caller
(here notify_cpu_starting()) deciding what to do.
         2) I am OK with the BUG_ON in notify_cpu_starting() for now.

I agree with BUG_ON in notify_cpu_starting().



Let me just clarify consequence of your proposal according to my
understanding. If instead of stopping the CPU when enabling a
capability fails the notifier returns an error, the error will
propagate to notify_cpu_starting() and BUG_ON will crash the system.
The proposal with stop_cpu() in the enable_nonboot_cpu_caps() instead
of panic that is submitted in this patch would stop only the erroneous
CPU. The rest of the system will continue to work and I though that is
what's the goal since we don't want to panic/BUG_ON.
 From that perspective I believe stop_cpu() in
enable_nonboot_cpu_caps() is better compared to returning an error
from the notifier.

You said above "I would be ok having stop_cpu called here" and I did
so (stop_cpu() in enable_nonboot_cpu_caps() instead of panic that
submitted in this patch).

My thoughts have evolved after Dario's discussion. He expressed concerned over your fix to make stop_cpu() working.

As you said I will maintain that code and this solution looks very error prone. If Stefano is happy with it, then fine.


If you believe my understanding is not correct, if I missed something
or you have another proposal please let me know.


Also, if you just want to convert panic from this patch into print I
don't believe it's a good approach, but I can do that.

I would prefer to see the notifier reporting the error with a warning and returning it.

At the notifier level it does not make sense to take the decision to stop the CPU or kill the system. This is a decision that should be taken at higher level such as in notify_cpu_starting().

The whole idea here is we have only one place taking the decision and we don't spread BUG_ON()/panic/stop_cpu everywhere. The benefit is having only one place to fix over multiple one because very likely the decision is the same everywhere.

I agree that today it will end up to crashing the system because of the BUG_ON. But that's a separate topic.

Cheers,


Thanks,
Mirela

Cheers,

--
Julien Grall

--
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.