[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 1/3] xen/ppc: Set up a basic C environment



On 6/22/23 5:49 PM, Andrew Cooper wrote:
> On 22/06/2023 9:57 pm, Shawn Anastasio wrote:
>> Update ppc64/head.S to set up an initial boot stack, zero the .bss
>> section, and jump to C.
>>
>> Also refactor the endian fixup trampoline into its own macro, since it
>> will need to be used in multiple places, including every time we make a
>> call into firmware (see next commit).
>>
>> Signed-off-by: Shawn Anastasio <sanastasio@xxxxxxxxxxxxxxxxxxxxx>
> 
> Thankyou for making this change - it definitely simplifies things.

No problem.

> I've done a bit of reading around, and it's rather sad that things prior
> to Power10 don't have PC-relative addressing.  So the LOAD_IMM64()'s do
> look necessary for the stack and bss.  I guess that means we can't be
> sensibly be position independent in head.S?

Prior to the introduction of pcrel loads/stores in P10, PIC is achieved
using a Table of Contents (TOC) whose base address is kept in r2 and can
be used as a relative base to address other symbols. I don't have -fPIC
enabled in this series yet (it's in the series I was hoping to submit
after this one), so for now I'm just loading the addresses as immediates
directly.

> Also, why store 0 onto the stack ?

On the ELFv2 ABI which we're using here, sp+0 is reserved for the "back
chain" pointer which is used to store the address of the caller's stack
frame and is used to support backtraces.

At the top of the stack, we need to make sure the first back chain
pointer is zero to ensure that anything walking the stack via these
pointers eventually terminates.

>> +
>> +    /* clear .bss */
>> +    LOAD_IMM64(%r14, __bss_start)
>> +    LOAD_IMM64(%r15, __bss_end)
>> +1:
>> +    std %r11, 0(%r14)
>> +    addi %r14, %r14, 8
>> +    cmpld %r14, %r15
>> +    blt 1b
> 
> This loop is correct, except for the corner case of this patch alone,
> where the BSS is empty and this will write one word out-of-bounds.
> 
> For RISC-V, we put a temporary "char bss_tmp;" in setup.c, and I suggest
> you do the same here, deleting it at a later point when there's real
> data in the bss.

Good catch. I actually introduce a .bss variable in patch 2 of this
series, so perhaps it would make the most sense for me to move this loop
to that patch?

Also it might make sense to have an assert in the linker script checking
that sizeof(.bss) > 0, though past early bring-up an empty .bss is
probably a pretty unlikely scenario...

>> +
>> +    /* call the C entrypoint */
>> +    LOAD_IMM64(%r12, start_xen)
>> +    mtctr %r12
>> +    bctrl
> 
> Why is this a LOAD_IMM64(), and not just:
> 
>     b start_xen
> 
> ?  From the same reading around, PPC64 seems to have +/- 32M addressing
> for branches, and the entire x86 hypervisor (.init included) is within 3M.

Good question. You're right that here it's entirely unnecessary. Once we
enable -fPIC, though, the calling convention for functions changes a bit
and necessitates that in certain scenarios r12 contains the entrypoint
of the function being called.

For reference, the reason this is needed is because each function
contains a prologue that calculates the base address of its
aforementioned TOC as a relative offset from its entrypoint, and for
that to work, the entrypoint needs to be contained in r12.

There is a short path that can be taken if the TOC pointer in r2 is
already set up and the calling function is in the same module as the
function being called (and thus they are known to share a TOC), but for
head.S simply taking the long path is an easy way to ensure the callee
has a valid TOC pointer.

In any case, its inclusion in this patch before -fPIC is enabled was an
oversight on my part and I'll change it to a `bl start_xen`.

>> +
>> +    /* should never return */
>> +    trap
>>  
>>      .size start, . - start
>>      .type start, %function
>> +
>> +    .section .init.data, "aw", @progbits
>> +
>> +    /* Early init stack */
>> +    .p2align 4
>> +cpu0_boot_stack_bottom:
>> +    .space STACK_SIZE
>> +cpu0_boot_stack:
>> +    .space STACK_FRAME_OVERHEAD
> 
> Why the extra space beyond the stack?

It's space for the aforementioned back chain pointer as well as a
few other things that callees are allowed to store in their caller's
stack frame. For example, routines are allowed to store the return
address from their caller at sp+16 (unlike x86, our "call" instruction
doesn't do that for you).

The ELFv2 specification contains a nice diagram of the stack frame on
page 31 (figure 2.18, section 2.2.2.1):
https://wiki.raptorcs.com/w/images/7/70/Leabi-20170510.pdf

> Also, I'd put cpu0_boot_stack in C, and just LOAD_IMM64(x,
> cpu0_boot_stack + STACK_SIZE)

Sure, I can do that.

>> diff --git a/xen/arch/ppc/setup.c b/xen/arch/ppc/setup.c
>> new file mode 100644
>> index 0000000000..4d1b72fa23
>> --- /dev/null
>> +++ b/xen/arch/ppc/setup.c
>> @@ -0,0 +1,13 @@
>> +/* SPDX-License-Identifier: GPL-2.0-or-later */
>> +#include <xen/init.h>
>> +
>> +void __init noreturn start_xen(unsigned long r3, unsigned long r4,
>> +                               unsigned long r5, unsigned long r6,
>> +                               unsigned long r7)
>> +{
>> +    for ( ;; )
>> +        /* Set current hardware thread to very low priority */
>> +        asm volatile("or %r31, %r31, %r31");
> 
> Is there something magic about the OR instruction, or something magic
> about %r31?

Using the OR instruction with all three operands equal is of course a
no-op, but when using certain registers it can have a separate magic
side effect.

`or r31,31,31` is one such form that sets the Program Priority Register
to "very low" priority. Of course here where we don't have SMP going
there's not much point in using this over the standard side effect-less
no-op (`or r0,r0,r0` or just `nop`).

For a table of these magic OR forms, you can see page 836 of the Power
ISA 3.0B:
https://wiki.raptorcs.com/w/images/c/cb/PowerISA_public.v3.0B.pdf

> ~Andrew

Thanks,
Shawn




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.