[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v5 15/16] x86: make Xen early boot code relocatable



>>> On 31.08.16 at 16:59, <daniel.kiper@xxxxxxxxxx> wrote:
> On Thu, Aug 25, 2016 at 08:41:39AM -0600, Jan Beulich wrote:
>> >>> On 20.08.16 at 00:43, <daniel.kiper@xxxxxxxxxx> wrote:
>> > Every multiboot protocol (regardless of version) compatible image must
>> > specify its load address (in ELF or multiboot header). Multiboot protocol
>> > compatible loader have to load image at specified address. However, there
>> > is no guarantee that the requested memory region (in case of Xen it starts
>> > at 1 MiB and ends at 17 MiB) where image should be loaded initially is a 
>> > RAM
>> > and it is free (legacy BIOS platforms are merciful for Xen but I found at
>> > least one EFI platform on which Xen load address conflicts with EFI boot
>> > services; it is Dell PowerEdge R820 with latest firmware). To cope with 
>> > that
>> > problem we must make Xen early boot code relocatable and help boot loader 
>> > to
>> > relocate image in proper way by suggesting, not requesting specific load
>> > addresses as it is right now, allowed address ranges. This patch does 
>> > former.
>> > It does not add multiboot2 protocol interface which is done in "x86: add
>> > multiboot2 protocol support for relocatable images" patch.
>> >
>> > This patch changes following things:
>> >   - default load address is changed from 1 MiB to 2 MiB; I did that because
>> >     initial page tables are using 2 MiB huge pages and this way required
>> >     updates for them are quite easy; it means that e.g. we avoid spacial
>> >     cases for start and end of required memory region if it live at address
>> >     not aligned to 2 MiB,
>>
>> Should this be a separate change then, to separate possible
>> regressions due to that from such due to any other of the changes
> 
> Potentially yes. However, it should be done properly. Otherwise in
> case of revert we can break Xen relocatable infrastructure and other
> things. So, I am not sure does it pays. Anyway, I will check is it
> possible or not.

It's not so much about possibly reverting (we'd rather revert
everything if such a need arises) than bisecting.

>> here? And then I don't see why, when making the image relocatable
>> anyway, the link time load address actually still matters.
> 
> It matters for multiboot (v1) and multiboot2 compatible boot
> loaders not supporting relocatable images.

Oh, right.

>> > --- a/xen/arch/x86/boot/head.S
>> > +++ b/xen/arch/x86/boot/head.S
>> > @@ -12,13 +12,16 @@
>> >          .text
>> >          .code32
>> >
>> > -#define sym_phys(sym)     ((sym) - __XEN_VIRT_START)
>> > +#define sym_offset(sym)   ((sym) - __XEN_VIRT_START)
>>
>> In a patch already large and hence hard to review, did you really
>> need to do this rename?
> 
> I think that sym_offset() is better term here. However, if you wish
> I can leave sym_phys() as is. Though, IMO, sym_phys() name suggests
> that we are playing with physical addresses and it is not correct in
> all contexts, e.g. loot at fs_offset() (or sym_fs() as you wish).

After some more thinking about this I agree sym_phys() isn't
ideal anymore after this patch. Still, to avoid unrelated changes in
this quite hard to review patch, I'd like to ask that you keep the
name here and do just the rename in a subsequent patch.

>> Also, just to double check - all these page table setup adjustments
>> don't require reflecting in xen.efi's page table setup code?
> 
> I will check it once again but I do not think so.

Thanks for doing so. You perhaps realize that Andrew said the same
for what became 2ce5963727, and then recently we had to fix things.

>> > --- a/xen/arch/x86/boot/trampoline.S
>> > +++ b/xen/arch/x86/boot/trampoline.S
>> > @@ -54,12 +54,20 @@ trampoline_gdt:
>> >          /* 0x0028: real-mode data @ BOOT_TRAMPOLINE */
>> >          .long   0x0000ffff
>> >          .long   0x00009200
>> > +        /*
>> > +         * 0x0030: ring 0 Xen data, 16 MiB size, base
>> > +         * address is computed during runtime.
>> > +         */
>> > +        .quad   0x00c0920000001000
>>
>> This does not look like it covers 1Mb, it's more like 1Mb+4k-1.
> 
> I have checked it once again. It covers 16 MiB as required:
>   4 KiB * 0x1000 = 0x1000000 (no taking into account relocation).

I'm sorry, but no. The raw limit value taken from that descriptor
is 0x1000, and the G bit is set. That makes the actual limit 0x1000fff.

> By the way, it looks that we can define this segment as 0x200000 - 0x1000000.
> However, then sym_fs() should be defined as:
> 
> #define sym_fs(sym)    %fs:(sym_offset(sym) - XEN_IMG_OFFSET)
> 
> Does it make sense?

If you feel like it's worth it ...

>> >          .pushsection .trampoline_rel, "a"
>> >          .long   trampoline_gdt + BOOT_PSEUDORM_CS + 2 - .
>> >          .long   trampoline_gdt + BOOT_PSEUDORM_DS + 2 - .
>> >          .popsection
>> >
>> > +GLOBAL(xen_img_load_base_addr)
>> > +        .long   0
>>
>> I've yet to understand the purpose of this symbol, but in any case:
>> If the trampoline code doesn't reference it, why does it get placed
>> here?
> 
> This symbol was used by code which unconditionally relocated Xen image
> even if boot loader did work for us. I removed most of this code because
> we agreed that if boot loader relocated Xen image then we do not have to
> relocate it higher once again. If you are still OK with this idea I can go
> further, as I planned, and remove all such remnants from next release.
> However, it looks that then I have to store load address in xen_phys_start
> from 32-bit assembly code. So, it will be nice if I can define it as
> "unsigned int" instead of "unsigned long". Is it safe?

I don't see why you shouldn't be able to store into the low 32 bits of
the variable without altering the high ones (which are all zero).

>> >          s = (boot_e820.map[i].addr + mask) & ~mask;
>> >          e = (boot_e820.map[i].addr + boot_e820.map[i].size) & ~mask;
>> > -        s = max_t(uint64_t, s, BOOTSTRAP_MAP_BASE);
>> >          if ( (boot_e820.map[i].type != E820_RAM) || (s >= e) )
>> >              continue;
>>
>> This means you now map memory between 2Mb and 16Mb here. Is
>> this necessary?
> 
> Without this change Xen relocated by boot loader crashes with:
> 
> (XEN) Panic on CPU 0:
> (XEN) Assertion 'l2e_get_flags(l2e) & _PAGE_PRESENT' failed at x86_64/mm.c:902
> 
> So, it must be mapped.

That's too little context to be useful for understanding the full
background.

>> > --- a/xen/arch/x86/xen.lds.S
>> > +++ b/xen/arch/x86/xen.lds.S
>> > @@ -55,7 +55,7 @@ SECTIONS
>> >    __2M_text_start = .;         /* Start of 2M superpages, mapped RX. */
>> >  #endif
>> >
>> > -  . = __XEN_VIRT_START + MB(1);
>> > +  . = __XEN_VIRT_START + XEN_IMG_OFFSET;
>> >    _start = .;
>> >    .text : {
>> >          _stext = .;            /* Text and read-only data */
>> > @@ -257,12 +257,14 @@ SECTIONS
>> >    .reloc : {
>> >      *(.reloc)
>> >    } :text
>> > -  /* Trick the linker into setting the image size to exactly 16Mb. */
>> >    . = ALIGN(__section_alignment__);
>> > +#endif
>> > +
>> > +  /* Trick the linker into setting the image size to exactly 16Mb. */
>> >    .pad : {
>> >      . = ALIGN(MB(16));
>> > +    __end_of_image__ = .;
>>
>> I see that you use this symbol in xen/arch/x86/Makefile, but I again
>> don't follow why this logic needs to change.
> 
> Current logic does not work because it gets wrong address from xen/xen-syms.
> This way boot loader allocates incorrect, usually huge, buffer for Xen image
> (wait, there is a chance that this is a fix for issues related to 2 MiB 
> mappings
> found by Andrew). I do not see simple reliable fix for current solution. So,
> I introduced __end_of_image__ and look for its address. This is much better
> method to establish end of image address then previous one. However, I can
> agree that this could be introduced in separate patch.

Yes please, as that also will (hopefully) result in the need for the
change getting properly explained.

>> > --- a/xen/include/asm-x86/page.h
>> > +++ b/xen/include/asm-x86/page.h
>> > @@ -288,7 +288,7 @@ extern root_pgentry_t
>> > idle_pg_table[ROOT_PAGETABLE_ENTRIES];
>> >  extern l2_pgentry_t  *compat_idle_pg_table_l2;
>> >  extern unsigned int   m2p_compat_vstart;
>> >  extern l2_pgentry_t l2_xenmap[L2_PAGETABLE_ENTRIES],
>> > -    l2_bootmap[L2_PAGETABLE_ENTRIES];
>> > +    l2_bootmap[4*L2_PAGETABLE_ENTRIES];
>>
>> Why do you need 4 of them? I can see why one might not suffice,
>> but two surely should?
> 
> In worst case we need three. One for l1_identmap hook and two
> for Xen image mapping. So, I am not sure that it pays to complicate
> assembly mapping code just to save just 1 page. Additionally, you
> should remember that l2_bootmap is freed after init.

Well, I was asking, not saying this needs to change. And you
pointing out the l1_identmap aspect makes clear this is fine as is.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.