[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v2 4/5] x86/boot: Simplify pagetable manipulation loops
On 17.01.2020 21:42, Andrew Cooper wrote: > For __page_tables_{start,end} and L3 bootmap initialisation, the logic is > unnecesserily complicated owing to its attempt to use the LOOP instruction, > which results in an off-by-8 memory address owing to LOOP's termination > condition. > > Rewrite both loops for improved clarity and speed. > > Misc notes: > * TEST $IMM, MEM can't macrofuse. The loop has 0x1200 iterations, so pull > the $_PAGE_PRESENT constant out into a spare register to turn the TEST into > its %REG, MEM form, which can macrofuse. > * Avoid the use of %fs-relative references. %esi-relative is the more common > form in the code, and doesn't suffer an address generation overhead. > * Avoid LOOP. CMP/JB isn't microcoded and faster to execute in all cases. > * For a 4 interation trivial loop, even compilers unroll these. The > generated code size is a fraction larger, but this is init and the asm is > far easier to follow. > * Reposition the l2=>l1 bootmap construction so the asm reads in pagetable > level order. > > No functional change. > > Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx> with two remarks/questions, but leaving it up to you whether you want to adjust the code: > --- a/xen/arch/x86/boot/head.S > +++ b/xen/arch/x86/boot/head.S > @@ -662,11 +662,17 @@ trampoline_setup: > mov %edx,sym_fs(boot_tsc_stamp)+4 > > /* Relocate pagetables to point at Xen's current location in memory. > */ > - mov $((__page_tables_end-__page_tables_start)/8),%ecx > -1: testl $_PAGE_PRESENT,sym_fs(__page_tables_start)-8(,%ecx,8) > + mov $_PAGE_PRESENT, %edx > + lea sym_esi(__page_tables_start), %eax > + lea sym_esi(__page_tables_end), %edi > + > +1: testb %dl, (%eax) /* if page present */ When it's an immediate, using TESTB is generally helpful because there's no (sign- or whatever-)extended immediate form of it. When using a register, I think it would generally be better to use native size, even if for register reads the partial register access penalty may (today) be zero. > @@ -701,22 +707,27 @@ trampoline_setup: > cmp %edx, %ecx > jbe 1b > > - /* Initialize L3 boot-map page directory entries. */ > - lea > __PAGE_HYPERVISOR+(L2_PAGETABLE_ENTRIES*8)*3+sym_esi(l2_bootmap),%eax > - mov $4,%ecx > -1: mov %eax,sym_fs(l3_bootmap)-8(,%ecx,8) > - sub $(L2_PAGETABLE_ENTRIES*8),%eax > - loop 1b > - > - /* Map the permanent trampoline page into l{1,2}_bootmap[]. */ > + /* Map 4x l2_bootmap[] into l3_bootmap[0...3] */ > + lea __PAGE_HYPERVISOR + sym_esi(l2_bootmap), %eax > + mov $PAGE_SIZE, %edx > + mov %eax, 0 + sym_esi(l3_bootmap) > + add %edx, %eax > + mov %eax, 8 + sym_esi(l3_bootmap) > + add %edx, %eax > + mov %eax, 16 + sym_esi(l3_bootmap) > + add %edx, %eax > + mov %eax, 24 + sym_esi(l3_bootmap) It took me a moment to realize the code is correct despite there not being any mention of PAGE_SIZE between each of the MOVs. As you don't view code size as a (primary) concern, perhaps worth using add $PAGE_SIZE, %eax everywhere, the more that this has a special, ModR/M-less encoding? Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |