[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v3 10/10] xen/arm: Enable errata for secondary CPU on hotplug after the boot
Hi Julien, On Fri, May 11, 2018 at 12:54 PM, Julien Grall <julien.grall@xxxxxxx> wrote: > > > On 11/05/18 11:41, Mirela Simonovic wrote: >> >> Hi Dario, >> >> On Thu, May 10, 2018 at 6:24 PM, Dario Faggioli <dfaggioli@xxxxxxxx> >> wrote: >>> >>> On Thu, 2018-05-10 at 17:49 +0200, Mirela Simonovic wrote: >>>> >>>> Regardless of the fact that the notifier returns an error or not, I >>>> believe it would be good and safe to set priority and document that >>>> priority zero would cause racing issue in the scenario I debugged >>>> today. I'm pretty sure that this discussion would be forgotten soon >>>> and it really should be documented in code/comment. >>>> >>> I may very well be missing or misunderstanding something, but I >>> continue to think that the problem here is that CPU_STARTING can't, >>> right now, fail, while you need it to be able to. >>> >>> If that is the case, giving different priorities to the notifier, is >>> not a solution. >>> >> >> Let me try to clarify. The assumption is that the starting CPU can >> fail. Additional requirement set by Julien is that panic or BUG_ON is >> not acceptable. > > > Please don't twist my word. I never said it was not acceptable to have the > BUG_ON in notify_cpu_starting(). I didn't say that either. You referenced notify_cpu_starting() here, not me. BTW, if you get the impression that I'm twisting words then we have a misunderstanding and your approach is not the best way toward clarifying it. Please have that in mind next time, because I do not have such an intent and I never will. I referred to panic/BUG_ON in this scenario regardless of the place from which it would be invoked. My understanding from your previous answers is that you think the system should not panic/BUG_ON in this scenario, because this kind of error the system should be able to survive. > > I am going to repeat the content of my answer to your last e-mail: > > I was aware about it since the beginning. The whole point of the > conversation was we should avoid to take the decision at the lower level and > let the upper layer decide what to do. > > If the system is failing today then that's fine and still fit what I said in > my first e-mail of that thread. For reminder: > > "We should really avoid to use panic(...) if this is something the system > can survive. In that specific case, it would only affect the current CPU. So > it would be better to return an error and let the caller decide what to do." > > To summarize: > 1) Notifiers should only report an error and let the upper caller > (here notify_cpu_starting()) deciding what to do. > 2) I am OK with the BUG_ON in notify_cpu_starting() for now. I agree with BUG_ON in notify_cpu_starting(). > Let me just clarify consequence of your proposal according to my understanding. If instead of stopping the CPU when enabling a capability fails the notifier returns an error, the error will propagate to notify_cpu_starting() and BUG_ON will crash the system. The proposal with stop_cpu() in the enable_nonboot_cpu_caps() instead of panic that is submitted in this patch would stop only the erroneous CPU. The rest of the system will continue to work and I though that is what's the goal since we don't want to panic/BUG_ON. From that perspective I believe stop_cpu() in enable_nonboot_cpu_caps() is better compared to returning an error from the notifier. You said above "I would be ok having stop_cpu called here" and I did so (stop_cpu() in enable_nonboot_cpu_caps() instead of panic that submitted in this patch). If you believe my understanding is not correct, if I missed something or you have another proposal please let me know. Thanks, Mirela > Cheers, > > -- > Julien Grall _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |