[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] xen/arm: Add Cortex-A73 erratum 858921 workaround





On 26/11/2020 11:27, Wei Chen wrote:
Hi Julien,

Hi Wei,

-----Original Message-----
From: Julien Grall <julien@xxxxxxx>
Sent: 2020年11月26日 18:55
To: Wei Chen <Wei.Chen@xxxxxxx>; Stefano Stabellini <sstabellini@xxxxxxxxxx>
Cc: Penny Zheng <Penny.Zheng@xxxxxxx>; xen-devel@xxxxxxxxxxxxxxxxxxxx;
Andre Przywara <Andre.Przywara@xxxxxxx>; Bertrand Marquis
<Bertrand.Marquis@xxxxxxx>; Kaly Xin <Kaly.Xin@xxxxxxx>; nd
<nd@xxxxxxx>
Subject: Re: [PATCH] xen/arm: Add Cortex-A73 erratum 858921 workaround

Hi Wei,

Your e-mail font seems to be different to the usual plain text one. Are
you sending the e-mail using HTML by any chance?


It's strange, I always use the plain text.

Maybe exchange decided to mangle the e-mail :). Anyway, this new message looks fine.


On 26/11/2020 02:07, Wei Chen wrote:
Hi Stefano,

-----Original Message-----
From: Stefano Stabellini <sstabellini@xxxxxxxxxx>
Sent: 2020��11��26�� 8:00
To: Wei Chen <Wei.Chen@xxxxxxx>
Cc: Julien Grall <julien@xxxxxxx>; Penny Zheng <Penny.Zheng@xxxxxxx>;
xen-
devel@xxxxxxxxxxxxxxxxxxxx; sstabellini@xxxxxxxxxx; Andre Przywara
<Andre.Przywara@xxxxxxx>; Bertrand Marquis
<Bertrand.Marquis@xxxxxxx>;
Kaly Xin <Kaly.Xin@xxxxxxx>; nd <nd@xxxxxxx>
Subject: RE: [PATCH] xen/arm: Add Cortex-A73 erratum 858921 workaround

Resuming this old thread.

On Fri, 13 Nov 2020, Wei Chen wrote:
Hi,

On 09/11/2020 08:21, Penny Zheng wrote:
CNTVCT_EL0 or CNTPCT_EL0 counter read in Cortex-A73 (all versions)
might return a wrong value when the counter crosses a 32bit boundary.

Until now, there is no case for Xen itself to access CNTVCT_EL0,
and it also should be the Guest OS's responsibility to deal with
this part.

But for CNTPCT, there exists several cases in Xen involving reading
CNTPCT, so a possible workaround is that performing the read twice,
and to return one or the other depending on whether a transition has
taken place.

Signed-off-by: Penny Zheng <penny.zheng@xxxxxxx>

Acked-by: Julien Grall <jgrall@xxxxxxxxxx>

On a related topic, do we need a fix similar to Linux commit
75a19a0202db "arm64: arch_timer: Ensure counter register reads occur
with seqlock held"?


I take a look at this Linux commit, it seems to prevent the seq-lock to be
speculated.  Using an enforce ordering instead of ISB after the read counter
operation seems to be for performance reasons.

I have found that you had placed an ISB before read counter in get_cycles
to prevent counter value to be speculated. But you haven't placed the
second
ISB after reading. Is it because we haven't used the get_cycles in seq lock
critical context (Maybe I didn't find the right place)? So should we need to
fix it
now, or you prefer to fix it now for future usage?

Looking at the call sites, it doesn't look like we need any ISB after
get_cycles as it is not used in any critical context. There is also a
data dependency with the value returned by it.

I am assuming you looked at all the users of NOW(). Is that right?


So I am thinking we don't need any fix. At most we need an in-code comment?

I agree with you to add an in-code comment. It will remind us in future when
we
use the get_cycles in critical context. Adding it now will probably only lead to
meaningless performance degradation.

I read this as there would be no perfomance impact if we add the
ordering it now. Did you intend to say?

Sorry about my English. I intended to say "Adding it now may introduce some
performance cost. And this performance cost may be not worth. Because Xen
may never use it in a similar scenario "

Don't worry! I think the performance should not be noticeable if we use the same trick as Linux.

In addition, AFAICT, the x86 version of get_cycles() is already able to
provide that ordering. So there are chances that code may rely on it.

While I don't necessarily agree to add barriers everywhere by default
(this may have big impact on the platform). I think it is better to have
an accurate number of cycles.


As x86 had done it, I think it’s ok to do it for Arm. This will keep a function
behaves the same on different architectures.

Just to be clear, I am not 100% sure this is what Intel is doing. Although this is my understanding of the comment in the code.

@Stefano, what do you think?

@Wei, assuming Stefano is happy with the proposal, would you be happy to send a patch for that?

Best regards,

--
Julien Grall



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.