[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2 0/8] pdx: introduce a new compression algorithm
On Wed, Jul 02, 2025 at 10:54:24AM +0200, Orzel, Michal wrote: > > > On 02/07/2025 10:26, Roger Pau Monné wrote: > > On Wed, Jul 02, 2025 at 09:52:45AM +0200, Orzel, Michal wrote: > >> > >> > >> On 02/07/2025 09:00, Roger Pau Monné wrote: > >>> On Tue, Jul 01, 2025 at 01:46:19PM -0700, Stefano Stabellini wrote: > >>>> On Tue, 1 Jul 2025, Jan Beulich wrote: > >>>>> Sadly from this you omitted the output from the setup of the offsets > >>>>> arrays. Considering also your later reply, I'd be curious to know what > >>>>> mfn_to_pdx(0x50000000) is. > >>>> > >>>> Full logs here, and debug patch in attachment. > >>>> > >>>> (XEN) Checking for initrd in /chosen > >>>> (XEN) RAM: 0000000000000000 - 000000007fffffff > >>>> (XEN) RAM: 0000000800000000 - 000000087fffffff > >>>> (XEN) RAM: 0000050000000000 - 000005007fffffff > >>>> (XEN) RAM: 0000060000000000 - 000006007fffffff > >>>> (XEN) RAM: 0000070000000000 - 000007007fffffff > >>>> (XEN) > >>>> (XEN) MODULE[0]: 0000000022000000 - 0000000022172fff Xen > >>>> (XEN) MODULE[1]: 0000000022200000 - 000000002220efff Device Tree > >>>> (XEN) MODULE[2]: 0000000020400000 - 0000000021e2ffff Kernel > >>>> (XEN) RESVD[0]: 0000000000000000 - 0000000000ffffff > >>>> (XEN) RESVD[1]: 0000000001000000 - 00000000015fffff > >>>> (XEN) RESVD[2]: 0000000001600000 - 00000000017fffff > >>>> (XEN) RESVD[3]: 0000000001800000 - 00000000097fffff > >>>> (XEN) RESVD[4]: 0000000009800000 - 000000000bffffff > >>>> (XEN) RESVD[5]: 0000000011126000 - 000000001114dfff > >>>> (XEN) RESVD[6]: 000000001114e000 - 000000001214efff > >>>> (XEN) RESVD[7]: 0000000017275000 - 000000001729cfff > >>>> (XEN) RESVD[8]: 000000001729d000 - 000000001829dfff > >>>> (XEN) RESVD[9]: 000000001a7df000 - 000000001a806fff > >>>> (XEN) RESVD[10]: 000000001a807000 - 000000001b807fff > >>>> (XEN) RESVD[11]: 000000001d908000 - 000000001d92ffff > >>>> (XEN) RESVD[12]: 000000001d930000 - 000000001e930fff > >>>> (XEN) RESVD[13]: 000000001829e000 - 000000001869dfff > >>>> (XEN) RESVD[14]: 000000001869e000 - 00000000186ddfff > >>>> (XEN) RESVD[15]: 0000000800000000 - 000000083fffffff > >>>> (XEN) > >>>> (XEN) > >>>> (XEN) Command line: console=dtuart dom0_mem=2048M > >>>> console_timestamps=boot debug bootscrub=0 vwfi=native sched=null > >>>> (XEN) [00000006bfc302ec] parameter "debug" unknown! > >>>> (XEN) [00000006bfcc0476] DEBUG init_pdx 294 start=0 end=80000000 > >>>> (XEN) [00000006bfcd2400] DEBUG init_pdx 294 start=800000000 end=880000000 > >>>> (XEN) [00000006bfce29ec] DEBUG init_pdx 294 start=50000000000 > >>>> end=50080000000 > >>>> (XEN) [00000006bfcf1768] DEBUG init_pdx 294 start=60000000000 > >>>> end=60080000000 > >>>> (XEN) [00000006bfd015a4] DEBUG init_pdx 294 start=70000000000 > >>>> end=70080000000 > >>>> (XEN) [00000006bfd1444f] DEBUG setup_mm 252 > >>>> (XEN) [00000006bfd3dc6f] DEBUG setup_mm 273 start=0 size=80000000 > >>>> ram_end=80000000 directmap_base_pdx=0 > >>>> (XEN) [00000006bfd5616e] DEBUG setup_directmap_mappings 229 base_mfn=0 > >>>> nr_mfns=80000 directmap_base_pdx=0 mfn_to_pdx=0 > >>>> (XEN) [00000006bfd7d38a] DEBUG setup_directmap_mappings 237 base_mfn=0 > >>>> nr_mfns=80000 directmap_base_pdx=0 > >>>> (XEN) [00000006bfd92728] DEBUG setup_mm 273 start=800000000 > >>>> size=80000000 ram_end=880000000 directmap_base_pdx=0 > >>>> (XEN) [00000006bfdaba3b] DEBUG setup_directmap_mappings 229 > >>>> base_mfn=800000 nr_mfns=80000 directmap_base_pdx=0 mfn_to_pdx=800000 > >>>> (XEN) [00000006bfdcd79c] DEBUG setup_directmap_mappings 237 > >>>> base_mfn=800000 nr_mfns=80000 directmap_base_pdx=0 > >>>> (XEN) [00000006bfde4d82] DEBUG setup_mm 273 start=50000000000 > >>>> size=80000000 ram_end=50080000000 directmap_base_pdx=0 > >>>> (XEN) [00000006bfdfaef0] DEBUG setup_directmap_mappings 229 > >>>> base_mfn=50000000 nr_mfns=80000 directmap_base_pdx=0 mfn_to_pdx=50000000 > >>>> (XEN) [00000006bfe35249] Assertion '(mfn_to_pdx(maddr_to_mfn(ma)) - > >>>> directmap_base_pdx) < (DIRECTMAP_SIZE >> PAGE_SHIFT)' failed at > >>>> ./arch/arm/include/asm/mmu/mm.h:72 > >>> > >>> As said on the other reply, the issue here is that with the v2 PDX > >>> offset compression logic your memory map is not compressible, and this > >>> leads to an overflow, as anything above 5TiB won't fit in the > >>> directmap AFAICT. We already discussed with Jan that ARM seems to be > >>> missing any logic to account for the max addressable page: > >>> > >>> https://lore.kernel.org/xen-devel/9074f1a6-a605-43f4-97f3-d0a626252d3f@xxxxxxxx/ > >>> > >>> x86 has setup_max_pdx() that truncates the maximum addressable MFN > >>> based on the active PDX compression and the virtual memory map > >>> restrictions. ARM needs similar logic to account for this > >>> restrictions. > >> > >> We have a few issues on Arm. First, we don't check whether direct map is > >> big > >> enough provided max_pdx that we don't set at all. Second, we don't really > >> use > >> PDX grouping (can be also used without compression). My patch (that Stefano > >> attached previously) fixes the second issue (Allejandro will take it over > >> to > >> come up with common solution). > > > > You probably can handle those as different issues, as PDX grouping is > > completely disjoint from PDX compression. It might be helpful if > > we could split the PDX grouping into a separate file from the PDX > > compression. > > > > One weirdness I've noticed with ARM is the addition of start offsets > > to the existing PDX compression, by using directmap_base_pdx, > > directmap_mfn_start, directmap_base_pdx &c. I'm not sure whether this will > > interfere with the PDX compression, but it looks like a bodge. This > > should be part of the generic PDX compression implementation, not an > > extra added on a per-arch basis. > > > > FWIW, PDX offset translation should already compress any gaps from 0 > > to the first RAM range, and hence this won't be needed (in fact it > > would just make ARM translations slower by doing an extra unneeded > > operation). My recommendation would be to move this initial offset > > compression inside the PDX mask translation. > > > >> For the first issue, we need to know max_page (at > >> the moment we calculate it in setup_mm() at the very end but we could do > >> it in > >> init_pdx() to know it ahead of setting direct map) and PDX offset (on x86 > >> there > >> is no offset). I also think that on Arm we should just panic if direct map > >> is > >> too small. > > > > Hm, that's up to the ARM folks, but my opinion is that you should > > simply ignore memory above the threshold. Panicking should IMO be a > > last resort option when there's no way to workaround the issue. > On Arm we handle user errors and suspicious behavior usually as panics as > oppose > to x86 which is more liberal in that regard. We want to fail as soon as > possible. > > > > >> The issue can be reproduced by disabling PDX compression, so not only with > >> Roger's patch. > >> > >> @Julien, I'm thinking of something like this: > >> > >> diff --git a/xen/arch/arm/arm32/mmu/mm.c b/xen/arch/arm/arm32/mmu/mm.c > >> index 4d22f35618aa..e6d9b49acd3c 100644 > >> --- a/xen/arch/arm/arm32/mmu/mm.c > >> +++ b/xen/arch/arm/arm32/mmu/mm.c > >> @@ -190,7 +190,6 @@ void __init setup_mm(void) > >> > >> /* Frame table covers all of RAM region, including holes */ > >> setup_frametable_mappings(ram_start, ram_end); > >> - max_page = PFN_DOWN(ram_end); > >> > >> /* > >> * The allocators may need to use map_domain_page() (such as for > >> diff --git a/xen/arch/arm/arm64/mmu/mm.c b/xen/arch/arm/arm64/mmu/mm.c > >> index a0a2dd8cc762..3e64be6ae664 100644 > >> --- a/xen/arch/arm/arm64/mmu/mm.c > >> +++ b/xen/arch/arm/arm64/mmu/mm.c > >> @@ -224,6 +224,9 @@ static void __init setup_directmap_mappings(unsigned > >> long > >> base_mfn, > >> */ > >> directmap_virt_start = DIRECTMAP_VIRT_START + > >> (base_mfn - mfn_gb) * PAGE_SIZE; > >> + > >> + if ( (max_pdx - directmap_base_pdx) > (DIRECTMAP_SIZE >> > >> PAGE_SHIFT) ) > >> + panic("Direct map is too small\n"); > > > > As said above - I would avoid propagating the usage of those offsets > > into generic memory management code, it's usage should be confined > > inside the translation functions. > directmap_base_pdx is set a few lines above, so I would not call it > propagation. > > > > > Here you probably want to use maddr_to_virt() or similar. > I can't because maddr_to_virt() has the ASSERT with similar check. > > > > You can maybe pickup: > > > > https://lore.kernel.org/xen-devel/20250611171636.5674-3-roger.pau@xxxxxxxxxx/ > > > > And attempt to hook it into ARM? > As said above, we have different ways to approach setting max_pdx. On Arm we > want to panic, on x86 you want to limit the max_pdx. > > > > > I don't think it would that difficult to reduce the consumption of > > memory map ranges to what Xen can handle. > > > > Thanks, Roger. > > The diff I sent fixes the issue for direct map now. We can take it now if we > want to solve the issue. If we instead want to wait for frametable fixes (\wrt > grouping) and possible PDX changes (making offsets common) to be done first, I > can simply park this patch. No please, don't park it just because of my opinions. I think Julien is OK with it, so don't hold back because of my x86 based opinion on how to handle errors. Regards, Roger.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |