[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] xen: Don't call panic if ARM TF cpu off returns DENIED



Hi,

On 17/06/2022 10:10, Volodymyr Babchuk wrote:
Julien Grall <julien@xxxxxxx> writes:

On 16/06/2022 19:40, Volodymyr Babchuk wrote:
Hi Julien,

Hi Volodymyr,

Julien Grall <julien@xxxxxxx> writes:

Hi,

On 16/06/2022 14:55, dmitry.semenets@xxxxxxxxx wrote:
From: Dmytro Semenets <dmytro_semenets@xxxxxxxx>
According to PSCI specification ARM TF can return DENIED on CPU OFF.

I am confused. The spec is talking about Trusted OS and not
firmware. The docummentation is also not specific to ARM Trusted
Firmware. So did you mean "Trusted OS"?
It should be "firmware", I believe.

Hmmm... I couldn't find a reference in the spec suggesting that
CPU_OFF could return DENIED because of the firmware. Do you have a
pointer to the spec?

Ah, looks like we are talking about different things. Indeed, CPU_OFF
can return DENIED only because of Trusted OS. But entity that *returns*
the error to a caller is a firmware.

Right, the interesting part is *why* DENIED is returned not *who* returns it.
Refer to "Arm Power State Coordination Interface (DEN0022D.b)"
section 5.5.2

Reading both 5.5.2 and 5.9.1 together, DENIED would be returned when
the trusted OS can only run on one core.

Some of the trusted OS are migratable. So I think we should first
attempt to migrate the CPU. Then if it doesn't work, we should prevent
the CPU to go offline.

That said, upstream doesn't support cpu offlining (I don't know for
your use case). In case of shutdown, it is not necessary to offline
the CPU, so we could avoid to call CPU_OFF on all CPUs but
one. Something like:

This is even better approach yes. But you mentioned CPU_OFF. Did you
mean SYSTEM_RESET?

By CPU_OFF I was referring to the fact that Xen will issue the call
all CPUs but one. The remaining CPU will issue the command to
reset/shutdown the system.


I just want to clarify: change that you suggested removes call to
stop_cpu() in halt_this_cpu(). So no CPU_OFF will be sent at all.

I was describing the existing behavior.


All CPUs except one will spin in

     while ( 1 )
         wfi();

while last cpu will issue SYSTEM_OFF or SYSTEM_RESET.

Is this correct?

Yes.


   void machine_halt(void)
@@ -21,10 +23,6 @@ void machine_halt(void)
       smp_call_function(halt_this_cpu, NULL, 0);
       local_irq_disable();

-    /* Wait at most another 10ms for all other CPUs to go offline. */
-    while ( (num_online_cpus() > 1) && (timeout-- > 0) )
-        mdelay(1);
-
       /* This is mainly for PSCI-0.2, which does not return if success. */
       call_psci_system_off();

Signed-off-by: Dmytro Semenets <dmytro_semenets@xxxxxxxx>
Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@xxxxxxxx>

I don't recall to see patch on the ML recently for this. So is this an
internal review?
Yeah, sorry about that. Dmytro is a new member in our team and he is
not
yet familiar with differences in internal reviews and reviews in ML.

No worries. I usually classify internal review anything that was done
privately. This looks to be a public review, althought not on
xen-devel.

I understand that not all some of the patches are still in PoC stage
and doing the review on your github is a good idea. But for those are
meant to be for upstream (e.g. bug fixes, small patches), I would
suggest to do the review on xen-devel directly.

It not always clear if patch is eligible for upstream. At first we
thought that problem is platform-specific and we weren't sure that we
will find a proper upstreamable fix.

You can guess but not be sure until you send it to upstrema :). In fact,...

Probably you saw that PR's name
quite differs from final patch. This is because initial solution was
completely different from final one.

... even before looking at your PR, this was the first solution I had in mind. I am still pondering whether this could be the best approach because I have the suspicion there might be some platform out relying on receiving the shutdown request on CPU0.

Anyway, this is so far just theorical, my proposal should solve your problem.

On a separate topic, the community is aiming to support a wide range of platforms out-of-the-box. I think platform-specific patches are acceptable so long they are self-contained (to some extend. I.e if you ask to support Xen on RPI3 then I would still probably argue against :)) or have a limited impact to the rest of the users (this is why we have alternative in Xen).

My point here is your initial solution may have been the preferred approach for upstream. So if you involve the community early, you are reducing the risk to have to backtrack and/or spend extra time in the wrong directions.

Cheers,

--
Julien Grall



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.