[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2 1/3] xen/riscv: introduce setup_initial_pages
Hi Julien, On Wed, 2023-03-22 at 14:21 +0000, Julien Grall wrote: ... > > > > > + unsigned long page_addr; > > > > + > > > > + // /* align start addresses */ > > > > + // map_start &= XEN_PT_LEVEL_MAP_MASK(0); > > > > + // pa_start &= XEN_PT_LEVEL_MAP_MASK(0); > > > > > > They should be switched to ASSERT() or BUG_ON(). > > Sure. Thanks. I'll update. > > > > > > > + > > > > + page_addr = map_start; > > > > + while ( page_addr < map_end ) > > > > > > This loop is continue to assume that only the mapping can first > > > in > > > 2MB > > > section (or less if the start is not 2MB aligned). > > > > > > I am OK if you want to assume it, but I think this should be > > > documented > > > (with words and ASSERT()/BUG_ON()) to avoid any mistake. > > I add a check in setup_initial_pagetables: > > BUG_ON(load_addr_start % (PAGE_ENTRIES * PAGE_SIZE) != 0); > > Probably this is not a correct place and should be moved to > > setup_initial_mapping() instead of setup_initial_pagetables() > > Yes it should be moved in setup_initial_mapping(). Then I'll moved it to setup_initial_mapping() > > > > > > > > + { > > > > + unsigned long index2 = pt_index(2, page_addr); > > > > + unsigned long index1 = pt_index(1, page_addr); > > > > + unsigned long index0 = pt_index(0, page_addr); > > > > + > > > > + /* Setup level2 table */ > > > > + second[index2] = paddr_to_pte((unsigned long)first); > > > > + second[index2].pte |= PTE_TABLE; > > > > + > > > > + /* Setup level1 table */ > > > > + first[index1] = paddr_to_pte((unsigned long)zeroeth); > > > > + first[index1].pte |= PTE_TABLE; > > > > + > > > > + > > > > > > NIT: Spurious line? > > Yeah, should be removed. Thanks. > > > > > > > + /* Setup level0 table */ > > > > + if ( !pte_is_valid(&zeroeth[index0]) ) > > > > > > On the previous version, you said it should be checked for each > > > level. > > I had a terrible internet connection, and my message wasn't sent. > > No worries. > > > > > I decided not to check that l2 and l1 are used only for referring > > to > > the next page table but not leaf PTE. So it is safe to overwrite it > > each time (the addresses of page tables are the same all the time) > > You are letting the caller to decide which page-table to use for each > level. So you are at the mercy that caller will do the right thing. > > IHMO, this is a pretty bad idea because debugging page-tables error > are > difficult. So it is better to have safety in place. This is not > worth... > > and > > probably it will be better from optimization point of view to > > ignore if > > clauses. > > ... the optimization in particular when this is at boot time. I didn't think about that caller will do always the right thing so I will add the check. > > > > > And it is needed in case of L0 because it is possible that some > > addressed were marked with specific flag ( execution, read, write ) > > and > > so not to overwrite the flags set before the check is needed. > In which case you should really report an error because the caller > may > have decide to set execution flag and you don't honor. So when the > code > is executed, you will receive a fault and this may be hard to find > out > what happen. Right now, it is expected situation that the caller will try to change execution flag during the setup of initial page tables. It is possible in the currently implemented logic of the setup of initial page tables. Let me explain what I mean. The first step of setup_initial_pagetables() is to map .text, .init, .rodata with necessary flags RX, RX, R. Remaining sections will have RW flags, and to map them, setup_initial_mapping() is called for the whole range of [linker_start, linker_end] not to map them one by one thereby during this step setup_initial_mapping() will try to remap addresses ranges which overlap with .text, .init, .rodata with RW flags but it should leave with the previously set flags. > > > > > > the next page table but not leaf PTE.But here you still only > > > check > > > for a single level. > > > > > > Furthermore, I would strongly suggest to also check the valid PTE > > > is > > > the > > > same as you intend to write to catch any override (they are a > > > pain to > > > debug). > > but if load addresses and linker addresses don't overlap is it > > possible > > situation that valid PTE will be overridden? > > A bug in the code. In fact, if you add the check you would have > notice > that your existing code is buggy (see below). > > > > > > > > + { > > > > + /* Update level0 table */ > > > > + zeroeth[index0] = paddr_to_pte((page_addr - > > > > map_start) > > > > + pa_start); > > > > + zeroeth[index0].pte |= flags; > > > > + } > > > > + > > > > + /* Point to next page */ > > > > + page_addr += XEN_PT_LEVEL_SIZE(0); > > > > + } > > > > +} > > > > + > > > > +/* > > > > + * setup_initial_pagetables: > > > > + * > > > > + * Build the page tables for Xen that map the following: > > > > + * load addresses to linker addresses > > I would suggest to expand because this is not entirely what you > exactly > are doing. In fact... > > > > > + */ > > > > +void __init setup_initial_pagetables(void) > > > > +{ > > > > + pte_t *second; > > > > + pte_t *first; > > > > + pte_t *zeroeth; > > > > + > > > > + unsigned long load_addr_start = boot_info.load_start; > > > > + unsigned long load_addr_end = boot_info.load_end; > > > > + unsigned long linker_addr_start = boot_info.linker_start; > > > > + unsigned long linker_addr_end = boot_info.linker_end; > > > > + > > > > + BUG_ON(load_addr_start % (PAGE_ENTRIES * PAGE_SIZE) != 0); > > > > + if (load_addr_start != linker_addr_start) > > > > + BUG_ON((linker_addr_end > load_addr_start && > > > > load_addr_end > > > > > linker_addr_start)); > > > > > > I would suggest to switch to a panic() with an error message as > > > this > > > would help the user understanding what this is breaking. > > > > > > Alternatively, you could document what this check is for. > > I think I will document it for now as panic() isn't ready for use > > now. > > > > > > > + > > > > + /* Get the addresses where the page tables were loaded */ > > > > + second = (pte_t *)(&xen_second_pagetable); > > > > + first = (pte_t *)(&xen_first_pagetable); > > > > + zeroeth = (pte_t *)(&xen_zeroeth_pagetable); > > > > + > > > > + setup_initial_mapping(second, first, zeroeth, > > > > + LOAD_TO_LINK((unsigned > > > > long)&_stext), > > > > + LOAD_TO_LINK((unsigned > > > > long)&_etext), > > > > + (unsigned long)&_stext, > > > > + PTE_LEAF_DEFAULT); > > > > + > > > > + setup_initial_mapping(second, first, zeroeth, > > > > + LOAD_TO_LINK((unsigned > > > > long)&__init_begin), > > > > + LOAD_TO_LINK((unsigned > > > > long)&__init_end), > > > > + (unsigned long)&__init_begin, > > > > + PTE_LEAF_DEFAULT | PTE_WRITABLE); > > > > + > > > > + setup_initial_mapping(second, first, zeroeth, > > > > + LOAD_TO_LINK((unsigned > > > > long)&_srodata), > > > > + LOAD_TO_LINK((unsigned > > > > long)&_erodata), > > > > + (unsigned long)(&_srodata), > > > > + PTE_VALID | PTE_READABLE); > > > > + > > > > + setup_initial_mapping(second, first, zeroeth, > > > > + linker_addr_start, > > > > + linker_addr_end, > > > > + load_addr_start, > > > > + PTE_LEAF_DEFAULT | PTE_READABLE); > > ... this is not cover above. AFAIU, this is the one for the 1:1 > mapping. But there is no 1:1 mapping. I thought that 1:1 mapping is when the physical address is equal to 0x10020 then after MMU is enabled the virtual address will be the same 0x10020 and mapped to 0x10020. Probably I missed something but this code maps all linker addresses to correspondent load addresses and it will be 1:1 only in case when load address is equal to linker address. > > > > > > > As I said in v1, you need to use a different set of page-table > > > here. > > If I understand you correctly I have to use a different set of > > page- > > table in case when it is possible that size of Xen will be larger > > then > > PAGE_SIZE. So I added to xen.lds.S a check to be sure that the size > > fits into PAGE_SIZE. > > This is not what I was referring to. I was pointing out that second, > first, zeroeth are exactly the same for all the callers. You want to > introduce a second set of zeroeth table. You will want to do the same > for first but it is not always used. > > Otherwise, this is not going to work if Xen is loaded at a different > address than the runtime. Ok. I understand what do you mean in general but still I am not sure that I understand a particular case when I am sure that the size of Xen is no bigger then 2MB and load address is aligned on 2Mb boundary. The size of one l0 page table is 512 (1 << 9 ) entries which covers 4K (1 << 12) * 512 = 2 Mb of memory so it should be fine. All the callers in my case are trying to map addresses from [linker_start, linker_end] which is not bigger then 2 MB and is aligned on 2 MB boundary. > > That said, when I spoke with Andrew yesterday, he mentioned that your > initial goal is to support the case where Xen is loaded at the > runtime > address. I understand this simplifies a lot the code and I told him I > was OK with that. However, it would be good to document what are your > goals in each series (this is not always clear what you skip on > purpose). > > > > > > Also, where do you guarantee that Xen will be loaded at a 2MB > > > aligned > > > address? (For a fact I know that UEFI is only guarantee 4KB > > > alignment). > > There is a check in setup_initial_pagetables: > > BUG_ON(load_addr_start % (PAGE_ENTRIES * PAGE_SIZE) != 0); > > This is not very obvious the check is to confirm that Xen is probably > aligned. I would suggest to add a comment. > > Also, you might want to use XEN_PT_LEVEL_SIZE(..) to make it more > obvious what sort of alignment you are trying to enforce. I will update add the comment and use XEN_PT_LEVEL_SIZE(...) instead. > > > > > + la sp, cpu0_boot_stack > > > > + li t0, STACK_SIZE > > > > + add sp, sp, t0 > > > > + > > > > + /* > > > > + * Re-init an address of exception handler as it was > > > > overwritten with > > > > + * the address of the .L_mmu_is_enabled label. > > > > + * Before jump to trap_init save return address of > > > > enable_mmu() to > > > > + * know where we should back after enable_mmu() will > > > > be > > > > finished. > > > > + */ > > > > + mv s0, ra > > > > + lla t0, trap_init > > > > + jalr ra, t0 > > > > + > > > > + /* > > > > + * Re-calculate the return address of enable_mmu() > > > > function for case > > > > + * when linker start address isn't equal to load start > > > > address > > > > + */ > > > > + add s0, s0, t1 > > > > + mv ra, s0 > > > > + > > > > + ret > > > > > > Missing ENDPROC? > > I haven't seen the usage of ENDPROC for RISC-V so it looks like it > > is > > not necessary. > > Ok. Would the objdump be able to report the function properly then? I > know that on Arm, it was necessary report assembly function properly. It is fine for RISC-V: Disassembly of section .text: 0000000000200000 <_start>: 200000: 10401073 csrw sie,zero 200004: 6299 lui t0,0x6 ... 20003c: 00000013 nop 0000000000200040 <enable_mmu>: 200040: 0003a297 auipc t0,0x3a ... 20009e: 941a add s0,s0,t1 2000a0: 80a2 mv ra,s0 2000a2: 8082 ret ... 00000000002000b0 <__bitmap_empty>: 2000b0: 1141 addi sp,sp,-16 2000b2: e422 sd s0,8(sp) 2000b4: 0800 addi s0,sp,16 ... > > > > > > > > diff --git a/xen/arch/riscv/xen.lds.S > > > > b/xen/arch/riscv/xen.lds.S > > > > index eed457c492..e4ac4e84b6 100644 > > > > --- a/xen/arch/riscv/xen.lds.S > > > > +++ b/xen/arch/riscv/xen.lds.S > > > > @@ -179,3 +179,5 @@ SECTIONS > > > > > > > > ASSERT(!SIZEOF(.got), ".got non-empty") > > > > ASSERT(!SIZEOF(.got.plt), ".got.plt non-empty") > > > > + > > > > +ASSERT(_end - _start <= MB(2), "Xen too large for early-boot > > > > assumptions") > > > > > ~ Oleksii > > > ~ Oleksii
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |