[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] xen/arm64: Use __flush_dcache_area instead of __flush_dcache_all
On Tue, Oct 7, 2014 at 3:40 AM, Mark Rutland <mark.rutland@xxxxxxx> wrote: > On Tue, Oct 07, 2014 at 05:15:58AM +0100, Roy Franz wrote: >> On Mon, Oct 6, 2014 at 9:28 AM, Mark Rutland <mark.rutland@xxxxxxx> wrote: >> > Hi Suravee, >> > >> > On Mon, Oct 06, 2014 at 04:49:10PM +0100, suravee.suthikulpanit@xxxxxxx >> > wrote: >> >> From: Suravee Suthikulpanit <Suravee.Suthikulpanit@xxxxxxx> >> >> >> >> when booting with EFI, __flush_dcache_all does not correctly flush data. >> >> >> >> According to Mark Rutland, __flush_dcache_all does not guaranteed to push >> >> data to the PoC if there is a system-level cache as it uses Set/Way >> >> operations. >> > >> > A better way to look at this is that Set/Way operations are never >> > guaranteed to flush data to the PoC, regardless of the presence of a >> > system-level cache. They might on certain implementations, but that's >> > not an architectural guarantee. The same caveat applies to using them to >> > push data to other points in the cache hierarchy (PoUU or PoUIS). >> > >> > Generally, Set/Way cache maintenance operations can only be used to >> > empty or clean the architected caches visible to a given CPU, and only >> > when all masters sharing those caches have been prevented from >> > allocating any cache entries. Outside of IMPLEMENTATION DEFINED >> > power-down sequences or reset-like operations they are typically the >> > wrong thing to use. >> > >> > So any other uses of Set/Way operations should also be treated as >> > suspect, and are likely to be problematic on platforms with system-level >> > caches. >> >> So what all do we need to flush? Do we need to flush all modified >> (dirty) cache lines, >> or just a specific subset? > > You need to flush anything which needs to be visible at the PoC. So > anything that needs to be accessible with the caches disabled needs to > be flushed. You also need to clean the range corresponding to anywhere > you intend to write to with the caches disabled. > >> In Linux the FDT which is modified in the Linux EFI stub isn't >> flushed, nor is the EFI memory map, >> both of which are modified by the UEFI firmware/boot stub. I feel >> like I'm missing >> something here. > > Within Linux we're getting lucky here because those accesses are all > done with the caches enabled, and we don't make any conflicting accesses > while the caches are disabled -- once we turn the caches back on the > data is visible again. > > There's a possible problem with mismatched aliases here, as UEFI could > have had cacheable mappings for any arbitrary subset of the physical > address space that might not match what we want to use. So far we > haven't encountered any because the memory attributes used by UEFI > happen to match that used by the kernel. It seems that for Xen we do need to flush the FDT as well - I get a variety of crashes with a corrupt FDT when cache state is modeled on the FVP model, and Suravee sees similar behavior on Seattle. I was not expecting this, as I looked at the code in Xen and the caches/TLB are enabled quite early on, before the FDT is accessed by Xen. I then looked at the mappings used by edk2 and Xen, and found some differences. Even after modifying edk2 to use the same configuration as Xen, the flushing of the FDT is still required. Xen and edk2 use the same memory attributes in the MAIR_EL2 register (0xFF), but had different sharing, access perm, and nG settings. The flushing of the FDT seems to be required, but I'm not sure why. Does linux access the FDT with the same flat mapping used by edk2? I think that Xen uses a different virtual mapping, so I suppose this could cause problems with a virtually tagged cache. (I couldn't find a description of that detail regarding the caches.) I'd really like to understand why this flush is required for Xen, and to make sure there there isn't other internal edk2 state that would also need flushing. > > In the absence of a system cache we could just nuke the cache hierarchy > by set/way to prevent that so long as we know no masters are allocating > entries while we do so. With a system cache it would be possible to nuke > the cache hierarchy by VA, but for the sizeable quantities of RAM we > expect that's not likely to be feasible. > >> >> Therefore, this patch switchs to use the "__flush_dcache_area" >> > >> > Nit: s/switchs/switches/ >> > >> >> mechanism, which is coppied from Linux. >> > >> > It would be good to state that this uses maintenance by VA, which (sane) >> > system caches should respect. >> > >> >> >> >> Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@xxxxxxx> >> >> --- >> >> >> >> NOTE: I still have not fully boot into Dom0 with this patch. >> >> However, it seems that the data is flushed out to physical >> >> memory now. >> >> >> >> xen/arch/arm/arm64/cache.S | 32 ++++++++++++++++++++++++++++++++ >> >> xen/arch/arm/arm64/head.S | 24 +++++++++++++++++++----- >> >> 2 files changed, 51 insertions(+), 5 deletions(-) >> >> >> >> diff --git a/xen/arch/arm/arm64/cache.S b/xen/arch/arm/arm64/cache.S >> >> index a445cbf..38f96c2 100644 >> >> --- a/xen/arch/arm/arm64/cache.S >> >> +++ b/xen/arch/arm/arm64/cache.S >> >> @@ -97,3 +97,35 @@ finished: >> >> isb >> >> ret >> >> ENDPROC(__flush_dcache_all) >> >> + >> >> +/* >> >> + * dcache_line_size - get the minimum D-cache line size from the CTR >> >> register. >> >> + */ >> >> + .macro dcache_line_size, reg, tmp >> >> + mrs \tmp, ctr_el0 // read CTR >> >> + ubfm \tmp, \tmp, #16, #19 // cache line size encoding >> >> + mov \reg, #4 // bytes per word >> >> + lsl \reg, \reg, \tmp // actual cache line size >> >> + .endm >> >> + >> >> +/* >> >> + * __flush_dcache_area(kaddr, size) >> >> + * >> >> + * Ensure that the data held in the page kaddr is written back to the >> >> + * page in question. >> >> + * >> >> + * - kaddr - kernel address >> >> + * - size - size in question >> >> + */ >> >> +ENTRY(__flush_dcache_area) >> >> + dcache_line_size x2, x3 >> >> + add x1, x0, x1 >> >> + sub x3, x2, #1 >> >> + bic x0, x0, x3 >> >> +1: dc civac, x0 // clean & invalidate D >> >> line / unified line >> >> + add x0, x0, x2 >> >> + cmp x0, x1 >> >> + b.lo 1b >> >> + dsb sy >> >> + ret >> >> +ENDPROC(__flush_dcache_area) >> >> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S >> >> index 7650abe..704f39d 100644 >> >> --- a/xen/arch/arm/arm64/head.S >> >> +++ b/xen/arch/arm/arm64/head.S >> >> @@ -740,16 +740,30 @@ ENTRY(lookup_processor_type) >> >> */ >> >> ENTRY(efi_xen_start) >> >> /* >> >> + * Preserve x0 (fdf pointer) across call to __flush_dcache_area, >> > >> > Sorry if this is a silly question, but what's the "fdf pointer"? >> > >> >> Should be fdt. This is a typo from my original patch. > > Ok. > >> Also, we should remove flush_dcache_all, as that was added for use in >> the EFI boot code. If we >> don't use it there it doesn't have a user in Xen. > > That sounds like a good idea to me. > >> >> + * restore for entry into Xen. >> >> + */ >> >> + mov x20, x0 >> >> + >> >> + /* >> >> + * Flush dcache covering current runtime addresses >> >> + * of xen text/data. Then flush all of icache. >> >> + */ >> >> + adrp x1, _start >> >> + add x1, x1, #:lo12:_start >> >> + adrp x2, _end >> >> + add x2, x2, #:lo12:_end >> >> + sub x1, x2, x1 >> > >> > Shouldn't the start address go in x0? We saved the fdf pointer earlier >> > but never placed the start address into x0. >> >> Yes, this does seem to be missing >> > >> > I take it Xen doesn't relocate itself? >> >> Xen does relocate itself, but that is done later in the boot process >> that is common between the EFI and Image >> boot methods. > > Ah, ok. > > Thanks, > Mark. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |